1. Introduction

Let the statistical experiment be generated by the pairs of observations
$Y^{(n)} = (X_i, Y_i)_{i=1,\dots,n}$, $n \in \mathbb{N}^*$, where $(X_i, Y_i)$ satisfies the equation

$$Y_i = f(X_i) \times U_i, \quad i = 1,\dots,n. \tag{1.1}$$

Here $f : [0,1]^d \to \mathbb{R}$ is an unknown function and we are interested in estimating $f$
at a given point $y \in [0,1]^d$ from the observation $Y^{(n)}$.

The random variables (noise) $(U_i)_{i=1,\dots,n}$ are supposed to be independent and
uniformly distributed on $[0,1]$.

The design points $(X_i)_{i=1,\dots,n}$ are deterministic and without loss of generality
we will assume that

$$X_i \in \big\{ 1/n^{1/d},\ 2/n^{1/d},\ \dots,\ 1 \big\}^d, \quad i = 1,\dots,n.$$
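As an illustration of the observation scheme (1.1), the following minimal sketch draws a sample on the regular design. It is only a sketch under stated assumptions: the names `design_points`, `simulate` and `f_example` are hypothetical, `n` is assumed to be a perfect $d$-th power, and the regression function is an arbitrary example bounded away from zero.

```python
import itertools
import math
import random


def design_points(n, d):
    """Regular grid {1/m, 2/m, ..., 1}^d with m = n^(1/d).

    Assumption of this sketch: n is a perfect d-th power.
    """
    m = round(n ** (1.0 / d))
    if m ** d != n:
        raise ValueError("n must be a perfect d-th power")
    axis = [k / m for k in range(1, m + 1)]
    return list(itertools.product(axis, repeat=d))


def simulate(f, n, d, rng):
    """Draw Y_i = f(X_i) * U_i with U_i i.i.d. uniform on [0, 1)."""
    xs = design_points(n, d)
    ys = [f(x) * rng.random() for x in xs]
    return xs, ys


def f_example(x):
    """Hypothetical regression function with values in [1, 3]."""
    return 2.0 + math.cos(2.0 * math.pi * x[0])


rng = random.Random(0)
xs, ys = simulate(f_example, n=100, d=2, rng=rng)
print(len(xs), xs[0], xs[-1])
```

Every simulated observation lies below the regression function, $Y_i \leq f(X_i)$, which is the feature exploited by the estimation procedure discussed in the sequel.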

Throughout the paper the unknown function $f$ is supposed to be smooth; in particular,
it belongs to the Hölder ball of functions $\mathbb{H}_d(\beta, L, M)$ (see Definition 1 below).
Here $\beta > 0$ is the smoothness of $f$, $M$ is the sum of upper bounds of $f$ and its
partial derivatives and $L > 0$ is the Lipschitz constant.

Moreover, we will consider only the functions $f$ separated away from zero by
some positive constant. Thus, from now on we will suppose that there exists
$0 < A < M$ such that $f \in \mathbb{H}_d(\beta, L, M, A)$, where

$$\mathbb{H}_d(\beta, L, M, A) = \Big\{ g \in \mathbb{H}_d(\beta, L, M) : \inf_{x \in [0,1]^d} g(x) \geq A \Big\}.$$

Motivation. The theoretical interest in the multiplicative regression model
(1.1) with discontinuous noise is dictated by the following fact. The typical
approach to the study of models with multiplicative noise consists in
transforming them into a model with additive noise and then applying a linear
smoothing technique based on standard methods such as kernel smoothing, local
polynomials, etc. Let us illustrate the latter approach with one of the most
popular nonparametric models, namely multiplicative gaussian regression

$$Y_i = \sigma(X_i)\,\xi_i, \quad i = 1,\dots,n. \tag{1.2}$$

Here $\xi_i$, $i = 1,\dots,n$, are i.i.d. standard gaussian random variables and the goal
is to estimate the variance $\sigma^2(\cdot)$.

Putting $Y_i' = Y_i^2$ and $\eta_i = \xi_i^2 - 1$ one can transform the model (1.2) into the
heteroscedastic additive regression:

$$Y_i' = \sigma^2(X_i) + \sigma^2(X_i)\,\eta_i, \quad i = 1,\dots,n,$$

where, obviously, $\mathbb{E}\eta_i = 0$. Applying any of the linear methods mentioned above
to the estimation of $\sigma^2(\cdot)$ one can construct an estimator whose estimation
accuracy is given by $n^{-\beta/(2\beta+d)}$ and which is optimal in the minimax sense (see
Definition 2). The latter result is proved under assumptions on $\sigma^2(\cdot)$ which are
similar to the assumptions imposed on the function $f(\cdot)$. In particular, $\beta$ denotes
the regularity of the function $\sigma^2(\cdot)$. The same result can be obtained for any
noise variables with known, continuously differentiable density possessing
sufficiently many moments.

The situation changes dramatically when one considers noise with a discontinuous
distribution density. Although the transformation of the original
multiplicative model into an additive one is still possible, in particular, the
model (1.1) can be rewritten as

$$Y_i' = f(X_i) + f(X_i)\,\eta_i, \quad Y_i' = 2Y_i, \quad \eta_i = 2U_i - 1, \quad i = 1,\dots,n,$$

the linear methods are not optimal anymore. As is proved in Theorem 2.1,
the optimal accuracy is given by $n^{-\beta/(\beta+d)}$. To achieve this rate, a nonlinear
estimation procedure based on the locally bayesian approach is proposed in
Section 2.
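The additive rewriting of the model (1.1) can be checked numerically. The sketch below is illustrative only: the values in `f_values` are hypothetical stand-ins for $f(X_i)$, and the check simply confirms that $2Y_i = f(X_i) + f(X_i)\eta_i$ holds for every draw and that $\eta_i = 2U_i - 1$ is centered.

```python
import random

rng = random.Random(1)
f_values = [1.5, 2.0, 3.0]  # hypothetical values of f(X_i)
n_rep = 100_000

max_identity_gap = 0.0
eta_sum = 0.0
for _ in range(n_rep):
    for fx in f_values:
        u = rng.random()          # uniform noise U_i
        y = fx * u                # multiplicative observation Y_i
        y_prime = 2.0 * y         # transformed response Y'_i = 2 Y_i
        eta = 2.0 * u - 1.0       # centered noise eta_i = 2 U_i - 1
        gap = abs(y_prime - (fx + fx * eta))
        max_identity_gap = max(max_identity_gap, gap)
        eta_sum += eta

eta_mean = eta_sum / (n_rep * len(f_values))
print(max_identity_gap, eta_mean)
```

The identity holds up to floating-point rounding, and the empirical mean of $\eta_i$ is close to zero, so the transformed model is indeed a heteroscedastic additive regression with centered noise.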

Another interesting feature is the selection from a given family of estimators
(see [2], [4]). Such selections are used for the construction of data-driven
(adaptive) procedures. In this context, several approaches to the selection from
a family of linear estimators were recently proposed, see for instance [4], [5], [8]
and the references therein. However, these methods rely heavily on the linearity
property. As we already mentioned, the locally bayesian estimators are nonlinear,
and in Section 3 we propose a selection rule for this family. This requires, in
particular, developing new non-asymptotic exponential inequalities, which may
be of independent interest.

Besides the theoretical interest, the multiplicative regression model is applied
in various domains, in particular in image processing. For example, the
so-called nonparametric frontier model (see [1], [19]) can be considered as
a particular case of the model (1.1). Indeed, the reconstruction of the regression
function $f$ can be viewed as the estimation of a production set $\mathcal{P}$: since
$Y_i \leq f(X_i)$ for all $i$, the estimation of $f$ reduces to finding the
upper boundary of $\mathcal{P}$. In this context, one can also cite [11], which deals with the
estimation of a function's support. It is worth mentioning that although
nonparametric estimation in the latter models has been studied, the problem of
adaptive estimation has not been considered in the literature.

Minimax estimation. The first part of the paper is devoted to minimax
estimation over $\mathbb{H}_d(\beta, L, M, A)$. This means, in particular, that the parameters
$\beta$, $L$, $M$ and $A$ are supposed to be known a priori. We find the minimax rate of
convergence (1.3) on $\mathbb{H}_d(\beta, L, M, A)$ and propose an estimator which is optimal in
the minimax sense (see Definition 2). Our first result (Theorem 2.1) in this direction
consists in establishing a lower bound for the maximal risk on $\mathbb{H}_d(\beta, L, M, A)$. We
show that for any $\beta \in \mathbb{R}_+^*$, the minimax rate of convergence is bounded from
below by the sequence

$$\varphi_n(\beta) = n^{-\beta/(\beta+d)}. \tag{1.3}$$

Next, we propose a minimax estimator, i.e. an estimator attaining the
normalizing sequence (1.3). To construct the minimax estimator we use the
so-called locally bayesian estimation construction, which consists in the following.
Let $V_h(y)$ be the neighborhood around $y$ such that $V_h(y) \subset [0,1]^d$, where $h \in (0,1)$ is a
given scalar. Fix an integer number $b > 0$ and let $D_b = \binom{b+d}{d}$.
Let $\mathcal{P}_b = \{p = (p_1,\dots,p_d) : p_i \in \mathbb{N},\ 0 \leq |p| \leq b\}$, $|p| = p_1 + \dots + p_d$; we
define the local polynomial

$$f_t(x) = \sum_{p \in \mathcal{P}_b} t_p \Big(\frac{x-y}{h}\Big)^{p}\, \mathbb{I}_{V_h(y)}(x), \quad x \in \mathbb{R}^d, \quad t = (t_p : p \in \mathcal{P}_b), \tag{1.4}$$

where $z^p = z_1^{p_1} \cdots z_d^{p_d}$ for $z = (z_1,\dots,z_d)$ and $\mathbb{I}$ denotes the indicator function.
The local polynomial $f_t$ can be viewed as an approximation of the regression
function $f$ inside the neighborhood $V_h$, and $D_b$ is the number of coefficients of
this polynomial. Introduce the following subset of $\mathbb{R}^{D_b}$:

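$$\Theta(A, M) = \big\{ t \in \mathbb{R}^{D_b} : 2t_{0,\dots,0} - \|t\|_1 \geq A,\ \|t\|_1 \leq M \big\}, \tag{1.5}$$

where $\|\cdot\|_1$ is the $\ell_1$-norm on $\mathbb{R}^{D_b}$. $\Theta(A, M)$ can be viewed as the set of coefficients $t$
such that $A \leq f_t(x) \leq M$ for all $t \in \Theta(A, M)$ and for all $x$ in the neighbourhood
$V_h(y)$. Consider the pseudo likelihood ratio

$$L_h\big(t, Y^{(n)}\big) = \prod_{i:\, X_i \in V_h(y)} \big[f_t(X_i)\big]^{-1}\, \mathbb{I}_{[0, f_t(X_i)]}(Y_i), \quad t \in \Theta(A, M).$$

Set also

$$\pi_h(t) = \int_{\Theta(A,M)} \|t - u\|_1\, L_h\big(u, Y^{(n)}\big)\, du, \quad t \in \Theta(A, M). \tag{1.6}$$

Let $\hat\theta(h)$ be the solution of the following minimization problem:

$$\hat\theta(h) = \arg\min_{t \in \Theta(A,M)} \pi_h(t). \tag{1.7}$$

The locally bayesian estimator $\hat f^{h}(y)$ of $f(y)$ is defined now as $\hat f^{h}(y) = \hat\theta_{0,\dots,0}(h)$.
Note that this local approach allows one to estimate successive derivatives of the
function $f$. In this paper, only the estimation of $f$ at a given point is studied.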
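To make the construction (1.5)-(1.7) concrete, here is a minimal sketch for a deliberately simplified case: $d = 1$ and a locally constant fit (a single coefficient $t$), rather than the degree-$b$ polynomial used in the paper. In that case the pseudo-likelihood is proportional to $u^{-m}$ on $u \geq \max_i Y_i$, where $m$ is the number of observations in the window, and the minimizer of $t \mapsto \int |t-u| L_h(u)\,du$ is the median of the normalized pseudo-likelihood. The function name `local_bayes_constant`, the window convention $[y-h, y+h]$, the grid size and the constant test function are all assumptions of this sketch.

```python
import math
import random


def local_bayes_constant(xs, ys, y0, h, A, M, grid_size=20001):
    """Degree-0 locally Bayesian estimate of f(y0) in dimension d = 1.

    With a constant local fit f_t = t, the pseudo-likelihood on [A, M] is
    L(u) = u^(-m) if u >= max(local Y_i) and 0 otherwise, where m is the
    number of design points in the window.  The minimiser of
    t -> integral |t - u| L(u) du is the median of the normalised L,
    located here by a cumulative sum on a grid.
    """
    # Assumption of this sketch: the window is [y0 - h, y0 + h].
    local = [y for x, y in zip(xs, ys) if abs(x - y0) <= h]
    if not local:
        raise ValueError("no design point in the window")
    m = len(local)
    lower = max(A, max(local))  # L(u) vanishes below the largest local Y_i
    if lower > M:
        raise ValueError("observations exceed the upper bound M")
    grid = [lower + (M - lower) * k / (grid_size - 1) for k in range(grid_size)]
    # Weights are taken relative to u = lower to keep u^(-m) in floating range.
    weights = [math.exp(-m * math.log(u / lower)) for u in grid]
    half = 0.5 * sum(weights)
    acc = 0.0
    for u, w in zip(grid, weights):
        acc += w
        if acc >= half:
            return u
    return grid[-1]


rng = random.Random(0)
n = 1000
xs = [k / n for k in range(1, n + 1)]
ys = [2.0 * rng.random() for _ in xs]  # hypothetical constant function f = 2
estimate = local_bayes_constant(xs, ys, y0=0.5, h=0.1, A=0.5, M=5.0)
local_max = max(y for x, y in zip(xs, ys) if abs(x - 0.5) <= 0.1)
print(local_max, estimate)
```

The estimate sits slightly above the largest local observation, which reflects the one-sided nature of the uniform noise: the data can only fall below the regression function.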

We note that a similar locally parametric approach based on maximum likelihood
estimators was recently proposed in [9] and [18] for regular statistical
models. But when the density of observations is discontinuous, the bayesian
approach outperforms the maximum likelihood estimator. This phenomenon is well
known in parametric estimation (see [6]). Moreover, establishing the statistical
properties of bayesian estimators typically requires much weaker assumptions
than those used for the analysis of maximum likelihood estimators.

As we see, our construction contains an extra parameter $h$ to be chosen. To
make this choice we use quite standard arguments. First, we note that in view of
the definition of $\mathbb{H}_d(\beta, L, M)$ (Definition 1 below), we have, for all $f \in \mathbb{H}_d(\beta, L, M)$,

$$\exists\, \theta = \theta(f, y, h) \in [-M, M]^{D_b} : \quad \sup_{x \in V_h(y)} |f(x) - f_\theta(x)| \leq L d h^{\beta}.$$

Remark that if $f \in \mathbb{H}_d(\beta, L, M)$, then $\theta \in \Theta(A, M)$. Thus, if $h$ is chosen
sufficiently small, our original model (1.1) is well approximated inside $V_h(y)$ by
the "parametric" model

$$Y_i = f_\theta(X_i) \times U_i, \quad i = 1,\dots,nh^d, \quad nh^d \in \mathbb{N}^*,$$

in which the bayesian estimator $\hat\theta$ is rate-optimal (see Theorem 2.2).
It is worth mentioning that the analysis of the deviation of the parametric
model from the original model (1.1) is not simple. It is here that the requirements
$0 < A \leq f(x) \leq M$, $\forall x \in [0,1]^d$, are used. This assumption, which seems not to be
necessary, allows us to make the presentation of the basic ideas clear and to
simplify routine computations (see also Remark 1).

Finally, $h = h_n(\beta, L) = (Ln)^{-1/(\beta+d)}$ is chosen as the solution of the following
minimization problem

$$L d h^{\beta} + \frac{1}{n h^{d}} \to \min_{h} \tag{1.8}$$

and we show that the corresponding estimator $\hat f^{h_n(\beta,L)}(y)$ is minimax for $f(y)$ on
$\mathbb{H}_d(\beta, L, M, A)$ if $\beta \leq b$ (see Theorem 2.2). Since the parameter $b > 0$ can be
chosen in an arbitrary way, the proposed estimator is minimax for any given value
of the parameter $\beta > 0$.

We remark that in regular statistical models, where linear methods are usually 
optimal, the choice of the bandwidth h is due to the relation 

— > min, 

h 

with the solution $h = (Ln)^{-1/(2\beta+d)}$. This explains the improvement of
the rate of convergence, $(1/n)^{\beta/(\beta+d)}$ compared to $(1/n)^{\beta/(2\beta+d)}$, in the model
with the discontinuous density.
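The gain in the rate can be made tangible with a few numbers. The sketch below only evaluates the closed-form expressions quoted in the text, namely the bandwidth $(Ln)^{-1/(\beta+d)}$ and the two rates $n^{-\beta/(\beta+d)}$ and $n^{-\beta/(2\beta+d)}$; the function names and the parameter values $\beta = 2$, $L = 1$, $d = 1$ are arbitrary choices made for illustration.

```python
def bandwidth(n, L, beta, d):
    """Bandwidth h_n(beta, L) = (L n)^(-1 / (beta + d)) quoted with (1.8)."""
    return (L * n) ** (-1.0 / (beta + d))


def rate_discontinuous(n, beta, d):
    """Rate n^(-beta / (beta + d)) for the model with discontinuous density."""
    return n ** (-beta / (beta + d))


def rate_regular(n, beta, d):
    """Rate n^(-beta / (2 beta + d)) typical of regular statistical models."""
    return n ** (-beta / (2.0 * beta + d))


for n in (100, 1000, 10000):
    print(
        n,
        round(bandwidth(n, L=1.0, beta=2.0, d=1), 4),
        round(rate_discontinuous(n, beta=2.0, d=1), 5),
        round(rate_regular(n, beta=2.0, d=1), 5),
    )
```

For every $n > 1$ the first rate is smaller than the second, and the gap widens as $n$ grows.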

Adaptive estimation. The second part of the paper is devoted to adaptive
minimax estimation over a collection of isotropic functional classes in the
model (1.1). To our knowledge, the problem of adaptive estimation in
multiplicative regression with noise having a discontinuous density has not been
studied in the literature.

A well-known drawback of the minimax approach is the dependence of the minimax
estimator on the parameters describing the functional class on which the maximal
risk is determined. In particular, the locally bayesian estimator $\hat f^{h}(\cdot)$ obviously
depends on the parameters $A$ and $M$ via the solution of the minimization
problem (1.7). Moreover, $h_n(\beta, L)$, optimally chosen in view of (1.8), depends
explicitly on $\beta$ and $L$. To overcome this drawback the minimax adaptive approach
was proposed (see [12], [13], [16]). The first question arising in adaptation
(reduced to the problem at hand) can be formulated as follows.

Does there exist an estimator which would be minimax on $\mathbb{H}_d(\beta, L, M, A)$
simultaneously for all values of $\beta$, $L$, $A$ and $M$ belonging to some given subset of
$\mathbb{R}_+^4$?

In Section 3, we show that the answer to this question is negative, which is
typical for the estimation of a function at a given point (see [15], [20], [21]).
This answer can be reformulated in the following manner: the family of rates
of convergence $\{\varphi_n(\beta),\ \beta \in \mathbb{R}_+^*\}$ is unattainable for the problem under
consideration.
Thus, we need to find another family of normalizations for the maximal risk
which would be attainable and, moreover, optimal in view of some criterion
of optimality. Nowadays, the most developed criterion of optimality is due to
Klutchnikoff [10].

We show that the family of normalizations which is optimal in view of this
criterion is

^.m-{^f. + 

whenever $\beta \in\, ]0, b]$. The factor $\rho_n$ can be considered as the price to pay for
adaptation (see [13]).

The most important step in proving the optimality of the family (1.9) is to
find an estimator, called adaptive, which attains the optimal family of
normalizations. Obviously, we seek an estimator whose construction is parameter-free,
i.e. independent of $\beta$, $L$, $A$ and $M$. In order to explain our estimation procedure
let us make several remarks.

First we note that the roles of the constants $A$, $M$ and $\beta$, $L$ in the construction
of the minimax estimator are quite different. Indeed, the constants $A$, $M$ are used
in order to determine the set $\Theta(A, M)$ needed for the construction of the locally
bayesian estimator, see (1.6) and (1.7). However, this set does not depend on
the localization parameter $h > 0$; in other words, the quantities $A$ and $M$ are
not involved in the selection of the optimal size of the local neighborhood given by
(1.8). In contrast, the constants $\beta$, $L$ are used for the derivation of the
optimal size of the local neighborhood (1.8), but they are not involved in the
construction of the collection of locally bayesian estimators $\{\hat f^{h},\ h > 0\}$.

The next remark explains how to replace the unknown quantities $A$ and $M$ in the
definition of $\Theta(A, M)$. Our first simple observation consists in the following:
the estimator $\hat f^{h_n(\beta,L)}$ remains minimax if we replace $\Theta(A, M)$ in (1.6) and
(1.7) by $\Theta(\tilde A, \tilde M)$ with any $0 < \tilde A \leq A$ and $M \leq \tilde M < \infty$. It follows from the
obvious inclusion $\mathbb{H}_d(\beta, L, A, M) \subset \mathbb{H}_d(\beta, L, \tilde A, \tilde M)$. The next observation is less
trivial and it follows from Proposition 1. Put /i,nax = and define for any 

function / 



b 

A{f)= jni fix), M{f)=J2 E 

rn=0 pi+...+Pd=m 



d^fiv) 



dx{' ■ ■ ■ dxP/ 



(1.10) 



The following agreement will be used in the sequel: if the function $f$ and $m \geq 1$
are such that $\partial^m f$ does not exist, we will put formally $\partial^m f = 0$ in the definition
of $M(f)$.

It remains to note that, contrary to the quantities $A$ and $M$, the functionals
$A(f)$ and $M(f)$ can be consistently estimated from the observation (1.1);
let $\hat A$ and $\hat M$ be the corresponding estimators. The idea now is to determine the
collection of locally bayesian estimators $\{\hat f^{h},\ h > 0\}$ by replacing $\Theta(A, M)$ in
(1.6) and (1.7) by the random parameter set $\hat\Theta$, which is defined as follows.

$$\hat\Theta = \Theta\big(\hat A/2,\ 4\hat M\big) = \big\{ t \in \mathbb{R}^{D_b} : 2t_{0,\dots,0} - \|t\|_1 \geq \hat A/2,\ \|t\|_1 \leq 4\hat M \big\}.$$

In this context it is important to emphasize that the estimators $\hat A$ and $\hat M$ are
built from the same observation which is used for the construction of the family
$\{\hat f^{h},\ h > 0\}$.

Contrary to the above, the constants $\beta$ and $L$ cannot be estimated
consistently. In order to select an "optimal" estimator from the family
$\{\hat f^{h},\ h > 0\}$ we use the general adaptation scheme due to Lepski [12], [14]. To
the best of our knowledge, this is the first time this method has been applied in
a statistical model with multiplicative noise and discontinuous distribution.
Moreover, except for the already mentioned papers [9] and [18], Lepski's procedure is
typically applied to the selection from a collection of linear estimators (kernel
estimators, locally polynomial estimators, etc.). In the present paper we apply
this method to a very complicated family of nonlinear estimators, obtained by the
use of the bayesian approach on the random parameter set. It required, in
particular, establishing an exponential inequality for the deviation of the locally
bayesian estimator from the parameter to be estimated (Proposition 1). It generalizes
the inequality proved for the parametric model (see [6], Chapter 1, Section 5); this
result seems to be new.

Simulations. In the present paper we adopt the local parametric approximation
to a purely nonparametric model. As is proved, this strategy leads to
theoretically optimal statistical decisions. But the minimax as well as the
minimax adaptive approaches are asymptotic, and it seems natural to check how
the proposed estimators work for reasonable sample sizes. In the simulation study,
we test the bayesian estimator in the parametric and nonparametric cases. We
show that the adaptive estimator approaches the oracle estimator. The oracle
estimator is selected from the family $\{\hat f^{h},\ h > 0\}$ under the hypothesis that $f$
is known. We show that the bayesian estimator performs well starting with
$n \geq 100$.

This paper is organized as follows. In Section 2 we present the results
concerning minimax estimation, and Section 3 is devoted to adaptive estimation.
The simulations are given in Section 4. The main results are proved in
Section 5 (upper bounds) and Section 6 (lower bounds). The Appendix (Section 7)
contains the auxiliary lemmas and the proofs of technical results.

2. Minimax estimation on isotropic Hölder class

In this section we present several results concerning minimax estimation.
First, we establish a lower bound for the minimax risk defined on $\mathbb{H}_d(\beta, L, M, A)$ for
any $\beta$, $L$, $M$ and $A$. For any $(p_1,\dots,p_d) \in \mathbb{N}^d$ we denote $p = (p_1,\dots,p_d)$ and
$|p| = p_1 + \dots + p_d$.

Definition 1. Fix $\beta > 0$, $L > 0$ and $M > 0$ and let $\lfloor\beta\rfloor$ be the largest integer
strictly less than $\beta$. The isotropic Hölder class $\mathbb{H}_d(\beta, L, M)$ is the set of functions
$f : [0,1]^d \to \mathbb{R}$ having on $[0,1]^d$ all partial derivatives of order $\lfloor\beta\rfloor$ and such that



9xf • • • dxP" 



E 



n 



dy^' ■ ■ ■ dy^/ 



Pj\ 



< 



yj\ 



where $x_j$ and $y_j$ are the $j$th components of $x$ and $y$.

This definition implies that if $f \in \mathbb{H}_d(\beta, L, M, A)$ (defined at the beginning of
this paper), then $A \leq A(f)$ and $M(f) \leq M$, where $A(f)$ and $M(f)$ are defined
in (1.10).

Maximal and minimax risk on $\mathbb{H}_d(\beta, L, M, A)$. To measure the performance
of estimation procedures on $\mathbb{H}_d(\beta, L, M, A)$ we will use the minimax
approach.

Let $\mathbb{E}_f = \mathbb{E}_f^{(n)}$ be the mathematical expectation with respect to the probability
law of the observation $Y^{(n)}$ satisfying (1.1). We define first the maximal risk on
$\mathbb{H}_d(\beta, L, M, A)$ corresponding to the estimation of the function $f$ at a given
point $y \in [0,1]^d$.

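Let $\tilde f$ be an arbitrary estimator built from the observation $Y^{(n)}$. For any $q > 0$ let

$$R_{n,q}\big[\tilde f, \mathbb{H}_d(\beta, L, M, A)\big] = \sup_{f \in \mathbb{H}_d(\beta, L, M, A)} \mathbb{E}_f \big|\tilde f(y) - f(y)\big|^{q}.$$

The quantity $R_{n,q}[\tilde f, \mathbb{H}_d(\beta, L, M, A)]$ is called the maximal risk of the estimator $\tilde f$
on $\mathbb{H}_d(\beta, L, M, A)$, and the minimax risk on $\mathbb{H}_d(\beta, L, M, A)$ is defined as

$$R_{n,q}\big[\mathbb{H}_d(\beta, L, M, A)\big] = \inf_{\tilde f} R_{n,q}\big[\tilde f, \mathbb{H}_d(\beta, L, M, A)\big],$$

where the infimum is taken over the set of all estimators.

Definition 2. The normalizing sequence $\psi_n$ is called the minimax rate of
convergence (MRT) and the estimator $\hat f$ is called minimax (asymptotically minimax)
if

$$\liminf_{n \to \infty} \psi_n^{-q}\, R_{n,q}\big[\hat f, \mathbb{H}_d(\beta, L, M, A)\big] > 0;$$
$$\limsup_{n \to \infty} \psi_n^{-q}\, R_{n,q}\big[\hat f, \mathbb{H}_d(\beta, L, M, A)\big] < \infty.$$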
Theorem 2.1. For any $\beta > 0$, $L > 0$, $M > 0$, $A > 0$, $q \geq 1$ and $d \geq 1$

$$\liminf_{n \to \infty} \varphi_n^{-q}(\beta)\, R_{n,q}\big[\mathbb{H}_d(\beta, L, M, A)\big] > 0, \quad \varphi_n(\beta) = n^{-\beta/(\beta+d)}.$$

Remark 1. The obtained result shows that on $\mathbb{H}_d(\beta, L, M, A)$ the minimax rate
of convergence cannot be faster than $n^{-\beta/(\beta+d)}$. In view of the obvious inclusion
$\mathbb{H}_d(\beta, L, M, A) \subset \mathbb{H}_d(\beta, L, M)$, the minimax rate of convergence on an isotropic
Hölder class is also bounded from below by $n^{-\beta/(\beta+d)}$.

The next theorem shows how to construct the minimax estimator based on the
locally bayesian approach. Put $\bar h = (Ln)^{-1/(\beta+d)}$ and let $\hat f^{\bar h}(y) = \hat\theta_{0,\dots,0}(\bar h)$ be given
by (1.5), (1.6) and (1.7) with $h = \bar h$.

Theorem 2.2. Let $\beta > 0$, $L > 0$, $M > 0$ and $A > 0$ be fixed. Then there exists
a constant $C^{*}$ such that for any $n \in \mathbb{N}^*$ satisfying $n \bar h^{d} \geq (\lfloor\beta\rfloor + 1)$

$$\varphi_n^{-q}(\beta)\, R_{n,q}\big[\hat f^{\bar h}(y), \mathbb{H}_d(\beta, L, M, A)\big] \leq C^{*}, \quad \forall q \geq 1.$$

The explicit form of $C^{*}$ is given in the proof.

Remark 2. We deduce from Theorems 2.1 and 2.2 that the estimator $\hat f^{\bar h}(y)$ is
minimax on $\mathbb{H}_d(\beta, L, M, A)$.

3. Adaptive estimation on isotropic Hölder classes

This section is devoted to adaptive estimation over the collection of
classes $\{\mathbb{H}_d(\beta, L, M, A)\}_{\beta, L, M, A}$. We will not impose any restriction on the possible
values of $L$, $M$, $A$, but we will assume that $\beta \in (0, b]$, where $b$, as previously, is
an arbitrary a priori chosen integer.

We start by formulating the result showing that there is no optimally adaptive
estimator (here we follow the terminology introduced in [13], [14]). It means
that there is no estimator which would be minimax simultaneously for several
values of the parameter $\beta$, even if all other parameters $L$, $M$ and $A$ are supposed
to be fixed. This result does not require any restriction on $\beta$ either.

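Theorem 3.1. For any $B \subset \mathbb{R}_+ \setminus \{0\}$ such that $\mathrm{card}(B) \geq 2$, for any $\beta_1, \beta_2 \in B$
and any $L > 0$, $M > 0$, $A > 0$

$$\liminf_{n \to \infty}\ \inf_{\tilde f} \Big[ \varphi_n^{-q}(\beta_1)\, R_{n,q}\big(\tilde f, \mathbb{H}_d(\beta_1, L, M, A)\big) + \varphi_n^{-q}(\beta_2)\, R_{n,q}\big(\tilde f, \mathbb{H}_d(\beta_2, L, M, A)\big) \Big] = +\infty,$$

where the infimum is taken over all possible estimators.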

The assertion of Theorem 3.1 can be made considerably more precise if $B = (0, b]$. To
do that we will need the following definition. Let $\Psi = \{\psi_n(\beta)\}_{\beta \in (0,b]}$ be a given
family of normalizations.

Definition 3. The family $\Psi$ is called admissible if there exists an estimator $\hat f_n$
such that for some $L > 0$, $M > 0$ and $A > 0$

$$\limsup_{n \to \infty} \psi_n^{-q}(\beta)\, R_{n,q}\big(\hat f_n, \mathbb{H}_d(\beta, L, M, A)\big) < \infty, \quad \forall \beta \in (0, b]. \tag{3.1}$$

The estimator $\hat f_n$ satisfying (3.1) is called $\Psi$-attainable. The estimator $\hat f_n$ is
called $\Psi$-adaptive if (3.1) holds for any $L > 0$, $M > 0$ and $A > 0$.

Note that the result proved in Theorem 3.1 means that the family of rates of
convergence $\{\varphi_n(\beta)\}_{\beta \in (0,b]}$ is not admissible. Denote by $\Phi$ the following family
of normalizations:

Uf^)^(^)'^\ PnW)^i + in(^), pern. 



We remark that $\phi_n(b) = \varphi_n(b)$ and $\rho_n(\beta) \sim \ln n$ for any $\beta \neq b$.

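Theorem 3.2. Let $\Psi = \{\psi_n(\beta)\}_{\beta \in (0,b]}$ be an arbitrary admissible family of
normalizations.

I. For any $\alpha \in (0, b]$ such that $\psi_n(\alpha) \neq \varphi_n(\alpha)$, there exists an admissible family
$\{\upsilon_n(\beta)\}_{\beta \in (0,b]}$ for which

$$\lim_{n \to \infty} \upsilon_n(\alpha)\, \psi_n^{-1}(\alpha) = 0.$$

II. If there exists $\gamma \in (0, b)$ such that

$$\lim_{n \to \infty} \psi_n(\gamma)\, \phi_n^{-1}(\gamma) = 0, \tag{3.2}$$

then necessarily

(a) $\lim_{n \to \infty} \psi_n(\beta)\, \phi_n^{-1}(\beta) > 0, \quad \forall \beta \in (0, \gamma)$;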



(b) lim 



MP) 



= 0, V/3e(7,6]. 



Several remarks are in order.

We note that if the family of normalizations $\Phi$ is admissible, i.e. one can
construct a $\Phi$-attainable estimator, then $\Phi$ is an optimal family of
normalizations in view of the Klutchnikoff criterion [10]. It follows from the second
assertion of the theorem. We note however that a $\Phi$-attainable estimator may depend on
$L > 0$, $M > 0$ and $A > 0$, and, therefore, this estimator is of theoretical
interest only. In the next section we construct a $\Phi$-adaptive estimator, which is, by its
definition, fully parameter-free. Moreover, this estimator obviously proves that
$\Phi$ is admissible and, therefore, optimal, as was mentioned above.

The assertions of Theorem 3.2 allow us to give a rather simple interpretation
of the Klutchnikoff criterion. Indeed, the first assertion, which is easily deduced
from Theorem 3.1, shows that any admissible family of normalizations can be
improved by another admissible family at any given point $\alpha \in (0, b]$ except
maybe one. In particular, it concerns the family $\Phi$ if it is admissible. On the
other hand, the second assertion of the theorem shows that there is no admissible
family which would outperform the family $\Phi$ at two points. Moreover, in
view of (b), a $\Phi$-adaptive (attainable) estimator, if it exists, has the same precision
on $\mathbb{H}_d(\beta, L, M, A)$, $\beta < \gamma$, as any $\Psi$-adaptive (attainable) estimator whenever
$\Psi$ satisfies (3.2). Additionally, (a) implies that the gain in precision
provided by a $\Psi$-adaptive (attainable) estimator on $\mathbb{H}_d(\gamma, L, M, A)$ leads automatically
to much larger losses on $\mathbb{H}_d(\beta, L, M, A)$ for any $\beta > \gamma$ with respect to
the precision provided by a $\Phi$-adaptive (attainable) estimator. We conclude that
a $\Phi$-adaptive (attainable) estimator outperforms any $\Psi$-adaptive (attainable)
estimator whenever $\Psi$ satisfies (3.2). It remains to note that any admissible family
not satisfying (3.2) is asymptotically equivalent to $\Phi$.

Construction of the $\Phi$-adaptive estimator. As was already mentioned in the
Introduction, the construction of our estimation procedure consists of several
steps. First, we determine the set $\hat\Theta$, built from the observation, which is then
used to define the family of locally bayesian estimators. Next, based on
Lepski's method (see [13] and [16]), we propose a data-driven selection from this
family.

First step : Determination of parameter set. Put /imax = ^~ Si,nd let 9 
be the solution of the following minimization problem. 



inf 



n 

E 



iv) 



2Y,-tK' 



X,, 



1 2 



Vniax(2;) = V,i„,^^(?/), 



where $K(z) = (z^p : p \in \mathcal{P}_b)$ is a $D_b$-dimensional vector and the sign $\top$ below
means the transposition. Thus, $\tilde\theta$ is the local least squares estimator and its
explicit expression is given by



= 2 



E 



K 



X,, 



where Y = (Fi, . . . , y„) and /C„(y) = 
design matrix. Put 

~Sp^p,\...pdlh-\p\9p, \p\<b 
Introduce the following quantities 

A = So...,o, M=\\S\\^, 

and define the random parameter set as follows. 



^Vn,^^iv)iXi) 



is the 



(3.3) 



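$$\hat\Theta = \big\{ t \in \mathbb{R}^{D_b} : 2t_{0,\dots,0} - \|t\|_1 \geq 2^{-1}\hat A,\ \|t\|_1 \leq 4\hat M \big\}. \tag{3.4}$$

$$\hat\pi_h(t) = \int_{\hat\Theta} \|t - u\|_1\, L_h\big(u, Y^{(n)}\big)\, du; \tag{3.5}$$

$$\hat\theta^{*}(h) = \arg\min_{t \in \hat\Theta} \hat\pi_h(t). \tag{3.6}$$

The family of locally bayesian estimators $\hat{\mathcal{F}}$ is defined now as follows.

$$\hat{\mathcal{F}} = \big\{ \hat f^{h}(y) = \hat\theta^{*}_{0,\dots,0}(h),\ h \in (0, h_{\max}] \big\}. \tag{3.7}$$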
Third step: Data-driven selection from the collection $\hat{\mathcal{F}}$. Put

hk — 1 /imaxj fc = 0, . . . , k„, 
where k„ is smallest integer such that /ik„ > /imin = In^cn^ n^^^'^ . Set 

= {/^''(y) = ^5,...,o(/»'o), fc = 0, . . . ,k„} . 

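We put $\hat f^{*}(y) = \hat f^{(\hat k)}(y)$, where $\hat f^{(\hat k)}(y)$ is selected from $\hat{\mathcal{F}}^{*}$ in accordance with
the rule:

$$\hat k = \inf\big\{ k = 0,\dots,\mathbf{k}_n : \big|\hat f^{(k)}(y) - \hat f^{(l)}(y)\big| \leq \hat M S_n(l),\ l = k+1,\dots,\mathbf{k}_n \big\}. \tag{3.8}$$

Here we have used the following notations.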

1 + Hn 2 

and \n{h) is the smallest eigenvalue of the matrix 



S^[l) = 4:i2Dl{i2qd+l&) \-\hi) 



I — 0, 1, ... , k„, 



MnHiv) - ;iE^" (\^) (\^) (3.9) 

which is completely determined by the design points and by the number of
observations. We will prove that there exists a nonnegative real $\lambda$ such that
$\lambda_n(h) \geq \lambda$ for any $n \geq 1$ and any $h \in [h_{\min}, h_{\max}]$ (see Lemma 2).
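The selection rule (3.8) follows the usual Lepski pattern: going from the largest bandwidth to the smallest, keep the first estimator that stays within the prescribed threshold of every estimator built on a smaller bandwidth. The sketch below shows only this comparison logic. The function name `lepski_select` is hypothetical, the list `thresholds` stands in for the quantities $\hat M S_n(l)$ (whose exact form is not reproduced here), and the numbers in the usage example are invented for illustration.

```python
def lepski_select(estimates, thresholds):
    """Return the smallest index k such that
    |estimates[k] - estimates[l]| <= thresholds[l] for every l > k.

    Index 0 corresponds to the largest bandwidth.  The last index always
    qualifies, because it has no finer bandwidth to be compared with.
    """
    if len(estimates) != len(thresholds):
        raise ValueError("one threshold per estimate is required")
    k_n = len(estimates) - 1
    for k in range(k_n + 1):
        if all(
            abs(estimates[k] - estimates[l]) <= thresholds[l]
            for l in range(k + 1, k_n + 1)
        ):
            return k
    return k_n


# Hypothetical estimates for decreasing bandwidths: the first two are biased
# by over-smoothing, the later ones fluctuate around the same value.
estimates = [3.0, 2.6, 2.1, 2.0, 1.9]
thresholds = [0.05, 0.1, 0.2, 0.4, 0.8]  # grow as the bandwidth shrinks
k_hat = lepski_select(estimates, thresholds)
print(k_hat, estimates[k_hat])
```

In this toy example the rule rejects the two largest bandwidths, whose estimates differ from the finer ones by more than the allowed threshold, and selects index 2.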

Theorem 3.3. Let an integer number $b > 0$ be fixed. Then for any $\beta \in (0, b]$,
$L > 0$, $M > 0$, $A > 0$ and $q \geq 1$

$$\limsup_{n \to \infty} \phi_n^{-q}(\beta)\, R_{n,q}\big[\hat f^{*}(y), \mathbb{H}_d(\beta, L, M, A)\big] < \infty.$$

Remark 3. The assertion of the theorem means that the proposed estimator
$\hat f^{*}(y)$ is $\Phi$-adaptive. It implies in particular that the family of normalizations $\Phi$
is admissible. This, together with Theorem 3.2, allows us to state the optimality
of $\Phi$ in view of the Klutchnikoff criterion (see [10]).
4. Simulation study

We will consider the case $d = 1$. The data are simulated according to the
model (1.1), where we use the following functions (Figure 1).

Figure 1. Test functions.

Here $f_1(x) = \cos(2\pi x) + 2$, $f_2(x) = 2\,\mathbb{I}[x \leq 1/3] + \mathbb{I}[1/3 < x \leq 2/3] + 3\,\mathbb{I}[2/3 < x]$ and
$f_3(x) = \cos(2\pi x) + 2 + 0.3 \sin(19\pi x)$.
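For readers who wish to reproduce the setting, the three test functions can be written down directly. This is only a sketch: the treatment of the break points $x = 1/3$ and $x = 2/3$ of the step function is an assumption, since the boundary convention does not affect the model away from those two points.

```python
import math


def f1(x):
    """Smooth test function cos(2 pi x) + 2."""
    return math.cos(2.0 * math.pi * x) + 2.0


def f2(x):
    """Step function with levels 2, 1 and 3 (boundary convention assumed)."""
    if x <= 1.0 / 3.0:
        return 2.0
    if x <= 2.0 / 3.0:
        return 1.0
    return 3.0


def f3(x):
    """f1 perturbed by a fast oscillation 0.3 sin(19 pi x)."""
    return f1(x) + 0.3 * math.sin(19.0 * math.pi * x)


grid = [k / 1000 for k in range(1, 1001)]
lowest = min(min(f1(x), f2(x), f3(x)) for x in grid)
print(lowest)
```

All three functions stay away from zero on $[0,1]$, in line with the assumption $f \geq A > 0$ used throughout the paper.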

To construct the family of estimators we use the linear approximation (b = 2),
i.e. within the neighbourhoods of the given size $h$, the locally bayesian estimator
has the form

$$\hat f^{h}(x) = \hat\theta_0 + \hat\theta_1 x, \quad x \in [0,1].$$

We define the ideal (oracle) value of the parameter $\tilde h = \tilde h(f)$ as the minimizer
of the risk:

$$\tilde h = \arg\inf_{h \in [1/n,\, 1]} \mathbb{E}_f \big|\hat f^{h}(y) - f(y)\big|.$$

To compute it we apply Monte-Carlo simulations (10000 repetitions). Our first
objective is to compare the risk provided by the "oracle" estimator $\hat f^{\tilde h}(\cdot)$ with
that provided by the adaptive estimator from Section 3. Figure 2 shows the
deviation of the adaptive estimator from the function to be estimated. At several
points, for example at $y = 1/2$, we observe the so-called over-smoothing
phenomenon, inherent to any adaptive estimator.




Figure 2. Examples of estimation with n = 100. 
Oracle-adaptive ratio. We compute the risks of the oracle and the adaptive
estimator at 100 points of the interval (0, 1). The next table presents the mean
value of the ratio oracle risk/adaptive risk calculated for the functions $f_1$, $f_2$, $f_3$
and n = 100, 1000.

            n = 100                                  n = 1000
function    adaptive risk   oracle-adaptive ratio    adaptive risk   oracle-adaptive ratio
f_1         0.13            0.84                     0.03            0.85
f_2         0.3             0.71                     0.1             0.75
f_3         0.28            0.65                     0.2             0.68

Figure 3. Numeric values of risk.



Figure 4 presents the "oracle risk/adaptive risk" ratio as a function of the
number of observations n.

Figure 4. Efficiency of bayesian estimator for three test functions (one panel
for each of $f_1$, $f_2$, $f_3$; horizontal axis: n from 200 to 1000).



Adaptation versus parametric estimation. We consider the function shown in
Figure 5, which is linear inside the neighborhood of size $h^{*} = 1/8$ around the
point 1/2, and simulate n = 1000 observations in accordance with the model
(1.1). Using only the observations corresponding to the interval [3/8, 5/8] we
construct the bayesian estimator $\hat f^{1/8}(1/2)$.

It is important to emphasize that this estimator is efficient [6] since the model
is parametric. Our objective now is to compare the risk of our adaptive estimator
with the risk provided by the estimator $\hat f^{1/8}(1/2)$. We also try to understand how
far the localization parameter $h_{\hat k}$, inherent to the construction of our adaptive
estimator, is from the true value 1/8. We compute the risk of each estimator via
the Monte-Carlo method with 10000 repetitions. For each repetition the procedure
selects the adaptive bandwidth $h_{\hat k}^{(j)}$, $j = 1,\dots,10000$. We confirm once again the
over-smoothing phenomenon since

$$h_{\hat k}^{(j)} = 0.1405 > h^{*} = 0.1250, \quad j = 1,\dots,10000.$$

Figure 5. Local parametric test function (locally parametric in y = 1/2 with h = 1/8).

Note however that the adaptive procedure selects a neighborhood whose size
is quite close to the true one. We also compute the risks of both estimators:
"bayesian risk" = 0.0206 and "adaptive risk" = 0.0308. We conclude that the
estimation accuracy provided by our adaptive procedure is quite satisfactory.



5. Proofs of main results: upper bounds

Let $\mathcal{H}_n$, $n > 1$, be the following subinterval of (0, 1).



'Hn — 



(&+ 1) V (Inn) (^+^ 



I \ b + d 

Inrt 



(5.1) 



Later on we will consider only the values of $h$ belonging to $\mathcal{H}_n$. We start by
establishing the exponential inequality for the deviation of the locally bayesian
estimator $\hat f^{h}(y)$ from $f(y)$. The corresponding inequality is the basic technical tool
allowing us to prove minimax and minimax adaptive results.



5.1. Exponential Inequality 

Introduce the following notations. For any $h \in \mathcal{H}_n$ put $\omega = \omega(f, y, h) =
\{\omega_p : p \in \mathcal{P}_b\}$, where $\omega_0 = \omega_{0,\dots,0} = f(y)$ and



Recall the agreement which we follow in the present paper: if the function $f$
and the vector $p$ are such that $\partial^{|p|} f$ does not exist we put $\omega_p = 0$.

Let $f_\omega(x)$, given by (1.4), be the local polynomial approximation of $f$ inside
$V_h(y)$ and let $b_h$ be the corresponding approximation error, i.e.

$$b_h = \sup_{x \in V_h(y)} |f_\omega(x) - f(x)|. \tag{5.3}$$

If $f \in \mathbb{H}_d(\beta, L, M)$, $\beta > 0$, one can remark that $b_h \leq L d h^{\beta}$ by the definition of $\omega$
in (5.2) and of $\mathbb{H}_d(\beta, L, M)$ in Definition 1. Put also



( 1 + GD^W h 
QAif)Dl 



Afh=bhxnh'', £{h)^exp\ ' (5.4) 



Introduce the random events $G_{\hat M} = \{|\hat M - M(f)| \leq M(f)/2\}$ and $G_{\hat A} =
\{|\hat A - A(f)| \leq A(f)/2\}$ and put $G = G_{\hat A} \cap G_{\hat M}$, where $\hat A$ and $\hat M$ are defined in
(3.3), Section 3.

Recall that $\lambda_n(h)$ (see Section 3) is the smallest eigenvalue of the matrix



h<^^ \ h \ h 

i=l ^ ^ ^ 

and $K(z)$ is the $D_b$-dimensional vector of the monomials $z^p$, $p \in \mathcal{P}_b$.

Proposition 1. For any h e Hn and any f such that A{f) > A and M{f) < 
M, then Ve > lUMDb{l\J Nh) / AX^Qi) 



P/ {nh''\f''{y)~ f{y)\>e, g) <S(A(/),Af(/))£(/i)exp 



An(fe) £ 

'432M(/) Dll' 



where $\hat f^{h}(y) \in \hat{\mathcal{F}}$ is defined in (3.7). The explicit expression of the function $S(\cdot,\cdot)$
is given at the beginning of the proof of the proposition.

The next proposition provides an upper bound for the risk of a locally
bayesian estimator.

Proposition 2. For any $n \in \mathbb{N}^*$, $h \in \mathcal{H}_n$ and any $f \in \mathbb{H}_d(\beta, L, M, A)$, there
exists $\lambda > 0$ such that $\lambda_n(h) \geq \lambda$ and



Ef\f''{y)-f{y)\'lG<C;{A{f),M{f)) 
where 



\y Ldnhf^+'^Y 
hd ' 



G*{a,m) = ^ 



432mDf(l + 6D^) 



(a, m)r{q), a, m > 0, 



and $\Gamma(\cdot)$ is the well-known Gamma function.

Remark 4. The analysis of the proof of Proposition 1 allows us to assert the
following inequality



f 



ih'^lf'^iy)- fiy)\>e) <^{A,M)£{h)cxp(^-^^Y 
where $\hat f^{h}(y)$ is the locally bayesian estimator which is the minimizer in (1.6).

Thus, the latter inequality can be viewed as an analogue of the result of
Proposition 1 when $A$ and $M$ are known. For the same reasons, we have



E/|/'(y)-/(y)r<q(AAO 



IV Ldnh'^+'^' 
nh<^ 



q>l. 



5.2. Proof of Proposition 1 

Before starting the proof, let us briefly discuss its ingredients.
Discussion.

I. First, the obvious inclusion (recall that $\hat\theta^*(h)$ minimizes $\pi_h$ defined in
(3.5))

$$ \left\{ nh^d \|\hat\theta^*(h) - \theta\|_1 \ge \varepsilon \right\} \subseteq \left\{ \inf_{nh^d \|t - \theta\|_1 \ge \varepsilon} \pi_h(t) \le \pi_h(\theta) \right\} $$

allows us to reduce the study of the deviation of $\hat\theta^*(h)$ from $\theta$ to the study of
the behaviour of $\pi_h$.

II. We note that $\pi_h$ is an integral functional of the pseudo-likelihood $L_h$. As
a consequence, the behaviour of $\pi_h$ is completely determined by this process.
Following [6] (Chapter 1, Section 5, Theorem 5.2), where similar problems were
studied under a parametric model assumption, we introduce the stochastic process

$$ Z_{h,\theta}(u) = \frac{L_h\left(\theta + (nh^d)^{-1} u,\ Y^{(n)}\right)}{L_h\left(\theta,\ Y^{(n)}\right)} $$

defined on $T_n = \{u \in \mathbb{R}^{D_b} : u = nh^d(t - \theta),\ t \in \Theta(A(f)/4, 9M(f))\}$.
Here, the vector $\theta = \theta(f, y, h) = \{\theta_p : p \in \mathcal{P}_b\}$ is defined as follows:

$$ \theta_0 = \theta_{0,\dots,0} = \omega_0 + b_h, \qquad \theta_p = \omega_p, \quad |p| \ne 0, $$

where $\omega$ is the vector of coefficients of the Taylor polynomial defined in (5.2). The definition
of $b_h$ obviously implies

$$ f_\theta(x) \ge f(x), \quad \forall x \in V_h(y). \tag{5.5} $$

As it was noted in [6] (Chapter 1, Section 5, Theorem 5.2), the following properties
of the process $Z_{h,\theta}$ are essential for the study of $\pi_h$:

- Hölder continuity of its trajectories;

- the rate of its decay at infinity.

The exact statements are formulated in Lemma 1 below.

III. As it was shown in [6] (Chapter 1, Section 5, Theorem 5.2), in the parametric
situation the above-mentioned properties of $Z_{h,\theta}$ provide the desirable
properties of the process

$$ Z_h(u) = \frac{Z_{h,\theta}(u)}{\int_{\hat T_n} Z_{h,\theta}(v)\, dv}, \qquad u \in \hat T_n := nh^d(\hat\Theta - \theta), $$

where the set $\hat\Theta$ is defined in (3.4). The exact statements are given in Assertions
1 and 2. The latter process is important in view of the following inclusion



{nh''\fHy) - fiy)\ > 4 - I /- \Mi^hiu)du > 



r 

\\u\\izii(u)du > 

t„(r) 

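Auxiliary Lemma. First, we note that in view of (5.5), the event $Y_i \le f_\theta(X_i)$
is always realized, because $Y_i \le f(X_i) \le f_\theta(X_i)$. Hence, $Z_{h,\theta}$ can be rewritten

$$ Z_{h,\theta}(u) = \prod_{i:\, X_i \in V_h(y)} \frac{f_\theta(X_i)}{f_{\theta + u(nh^d)^{-1}}(X_i)}\, \mathbf{1}\left[ Y_i \le f_{\theta + u(nh^d)^{-1}}(X_i) \right], \quad u \in T_n. \tag{5.6} $$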
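The product form of the likelihood ratio for uniform multiplicative noise can be checked numerically. The sketch below is only an illustration under stated assumptions: the values `f_vals`, `a0` and `au` are hypothetical stand-ins for $f(X_i)$, $f_\theta(X_i)$ and $f_{\theta+u(nh^d)^{-1}}(X_i)$ on the window $V_h(y)$, and the closed form $E_f Z_{h,\theta}(u) = \prod_i \frac{f_\theta(X_i)}{f_{\theta+u(nh^d)^{-1}}(X_i)} \left(1 \wedge \frac{f_{\theta+u(nh^d)^{-1}}(X_i)}{f(X_i)}\right)$ is the one implied by independence of the $U_i$.

```python
import numpy as np

def z_process(y, a0, au):
    """Likelihood ratio of (5.6) for one sample in the window.

    y  : observations Y_i = f(X_i) * U_i
    a0 : values f_theta(X_i)
    au : values of the perturbed polynomial at X_i
    """
    if np.any(y > au):          # one indicator fails -> whole product is zero
        return 0.0
    return float(np.prod(a0 / au))

def expected_z(f_vals, a0, au):
    """Closed form prod (a0/au) * min(1, au/f), using P(Y_i <= a) = min(1, a/f)."""
    return float(np.prod(a0 / au * np.minimum(1.0, au / f_vals)))

rng = np.random.default_rng(0)
f_vals = np.array([1.0, 1.1, 1.2, 1.3])          # hypothetical f(X_i)
a0 = f_vals + 0.05                               # f_theta >= f, as in (5.5)
au = a0 + np.array([0.10, -0.20, 0.05, -0.10])   # hypothetical perturbation

n_rep = 200_000
y = f_vals * rng.uniform(size=(n_rep, f_vals.size))
inside = np.all(y <= au, axis=1)                 # all indicators equal to one
mc_mean = float(np.mean(inside) * np.prod(a0 / au))
exact = expected_z(f_vals, a0, au)
upper = float(np.prod(a0 / f_vals))              # bound prod f_theta / f
```

Under these assumptions the Monte Carlo mean `mc_mean` should agree with `exact` up to simulation error, and both stay below `upper`.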

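Lemma 1. For any $f \in \mathbb{H}_d(\beta, L, M, A)$ and $h \in \mathcal{H}_n$:

1. $\sup_{u_1, u_2 \in T_n} \|u_1 - u_2\|_1^{-1}\, E_f |Z_{h,\theta}(u_1) - Z_{h,\theta}(u_2)| \le C_h$,

2. $E_f Z^{1/2}_{h,\theta}(u) \le e^{-g_h(\|u\|_1)}$, $\forall u \in T_n$,

3. $P_f \left\{ \int_{[0,\delta]^{D_b}} Z_{h,\theta}(u)\, du < \delta^{D_b}/2 \right\} \le 2 C_h \delta$, $\forall \delta > 0$,

where

$$ C_h = 8\left(1 \vee D_b A^{-1}(f)\right) \exp\left\{ 1 + \frac{N_h}{A(f)} \right\}, \qquad g_h(a) = \frac{\lambda_n(h)\, a}{18 M(f) D_b} - \frac{N_h}{A(f)}, $$

with $a > 0$, and $\lambda_n(h)$ is the smallest eigenvalue of the matrix $\mathcal{M}_{nh}(y)$ defined
in (3.9).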

Proof of Proposition 1. Define for any m > and u > 

2(0,771) = supl6e(lVAa-i)l](m)[S, +6] exp (-—^j, (5.7) 



where = z^^^^ + 2(2z + 2)^^" + 5 + [z^^ + (2z + 2) ^ ' j , A > is 

defined such that : A„(/i) > A for any n e N*, /i G "Hn (for more details, see 
Lemma 2) and 



= , , c(7;)=exp{-(547;i?2)-i|^ 

{\-c{y)) 

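Assertion 1. For any $\varepsilon > 0$ and for all $r$ such that $0 < r < \varepsilon/3$, we have

$$ P_f\left( nh^d |\hat f^h(y) - f(y)| \ge \varepsilon,\ G \right) \le 2 P_f\left( \int_{\hat T_n \cap (\|u\|_1 > r)} \|u\|_1 Z_h(u)\, du \ge \frac{r}{2},\ G \right). $$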
Assertion 2. For all $h \in \mathcal{H}_n$, any $f$ such that $A(f) \ge A$ and $M(f) \le M$,
and any $a \ge 12 M D_b (1 \vee N_h)/(\lambda A)$,



\\u\\iZh{u) dulc 

t„n| ||it||i>a| 



<a^M{J))Ba Cn exp | - (a) 



where gh{-) is defined in Lemma 1. 

1°. Suppose that Assertions 1 and 2 are proved. Then, in view of Assertion
2, choosing $r = \varepsilon/4$ we get

E/ / \\u\\^Zh{u)lGdu<-^^MU))B,,iChe~^^'^^""^\ 

"'t„n(jl«||i>e/4) 4 

Using the Tchebychev inequality, we have in view of the last inequality 

P/ ( / \\u\\izh{u)du>l, g] <2S](A/(/))S,/4C^e"^''^'^'\ 

\Jt,.n(||u||i>e/4) ° J 

The assertion of Proposition 1 follows now from the last inequality, Assertion 1
and the definitions of $C_h$, $g_h(\cdot)$ and the function $\mathfrak{B}(\cdot, \cdot)$.

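2°. Now, we will prove Assertion 1. The definition of $\hat\theta^*(h)$ and $\theta = \theta(f, y, h)$
implies, $\forall \varepsilon > 0$,

$$ P_f\left( nh^d |\hat f^h(y) - f(y)| \ge \varepsilon,\ G \right) \le P_f\left( nh^d |\hat\theta^*_0(h) - \theta_0| \ge \varepsilon,\ G \right) \le P_f\left( nh^d \|\hat\theta^*(h) - \theta\|_1 \ge \varepsilon,\ G \right). \tag{5.8} $$

Some remarks are in order. First, it is easily seen that $\theta \in \Theta(A(f), 3M(f))$.
Therefore, if the event $G$ holds then $\theta \in \hat\Theta$. Recall also that $\hat\theta^*(h)$ minimizes
$\pi_h$ defined in (3.5) and, therefore, the following inclusion holds since $\hat\theta^*(h) \in \hat\Theta$:

$$ \left\{ nh^d \|\hat\theta^*(h) - \theta\|_1 \ge \varepsilon \right\} \cap G \subseteq \left\{ \inf_{t \in \hat\Theta:\ nh^d\|t - \theta\|_1 \ge \varepsilon} \pi_h(t) \le \pi_h(\theta) \right\} \cap G. \tag{5.9} $$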

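Moreover,

$$ \pi_h(t) = (nh^d)^{-1} \int_{\hat\Theta} \|nh^d(t - u)\|_1\, L_h(u, Y^{(n)})\, du $$
$$ = (nh^d)^{-D_b - 1} \int_{\hat T_n} \|nh^d(t - \theta) - u\|_1\, L_h\left(\theta + u(nh^d)^{-1}, Y^{(n)}\right) du $$
$$ = (nh^d)^{-D_b - 1} L_h(\theta, Y^{(n)}) \int_{\hat T_n} \|nh^d(t - \theta) - u\|_1\, Z_{h,\theta}(u)\, du. $$

Hence, $\tau_n = nh^d(\hat\theta^*(h) - \theta)$ is the minimizer of

$$ \chi_n(s) = \int_{\hat T_n} \|s - u\|_1 \frac{Z_{h,\theta}(u)}{\int_{\hat T_n} Z_{h,\theta}(v)\, dv}\, du = \int_{\hat T_n} \|s - u\|_1 Z_h(u)\, du $$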
and we obtain from (5.8) and (5.9), for any $\varepsilon > 0$,

$$ P_f\left( \|nh^d(\hat\theta^*(h) - \theta)\|_1 \ge \varepsilon,\ G \right) \le P_f\left( \inf_{\|s\|_1 \ge \varepsilon} \chi_n(s) \le \chi_n(0),\ G \right). \tag{5.10} $$

Let $0 < r < \varepsilon/3$ be a number whose choice will be done later. We have



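$$ \chi_n(0) \le r \int_{\hat T_n \cap (\|u\|_1 \le r)} Z_h(u)\, du + \int_{\hat T_n \cap (\|u\|_1 > r)} \|u\|_1 Z_h(u)\, du. $$

Note also that

$$ \inf_{\|s\|_1 \ge \varepsilon} \chi_n(s) \ge \inf_{\|s\|_1 \ge \varepsilon} \int_{\hat T_n \cap (\|u\|_1 \le r)} \left( \|s\|_1 - \|u\|_1 \right) Z_h(u)\, du \ge (\varepsilon - r) \int_{\hat T_n \cap (\|u\|_1 \le r)} Z_h(u)\, du. $$

It yields in particular

$$ \chi_n(0) - \inf_{\|s\|_1 \ge \varepsilon} \chi_n(s) \le -(\varepsilon - 2r) \int_{\hat T_n \cap (\|u\|_1 \le r)} Z_h(u)\, du + \int_{\hat T_n \cap (\|u\|_1 > r)} \|u\|_1 Z_h(u)\, du. $$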

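Thus, $\forall r \in (0, \varepsilon/3)$

$$ P_f\left( \chi_n(0) - \inf_{\|s\|_1 \ge \varepsilon} \chi_n(s) \ge 0,\ G \right) \le P_f\left( \int_{\hat T_n \cap (\|u\|_1 > r)} \|u\|_1 Z_h(u)\, du \ge (\varepsilon - 2r) \int_{\hat T_n \cap (\|u\|_1 \le r)} Z_h(u)\, du,\ G \right) $$
$$ \le P_f\left( \int_{\hat T_n \cap (\|u\|_1 > r)} \|u\|_1 Z_h(u)\, du \ge r/2,\ G \right) + P_f\left( (\varepsilon - 2r) \int_{\hat T_n \cap (\|u\|_1 \le r)} Z_h(u)\, du < r/2,\ G \right). \tag{5.11} $$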



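We note that the second term in (5.11) can be controlled by the first one whenever
$0 < r < \varepsilon/3$. Indeed, putting $\hat T_n(r) = \hat T_n \cap (u \in \mathbb{R}^{D_b} : \|u\|_1 > r)$ and using $\varepsilon - 2r > r$, we get

$$ P_f\left( (\varepsilon - 2r) \int_{\hat T_n \cap (\|u\|_1 \le r)} Z_h(u)\, du < r/2,\ G \right) $$
$$ \le P_f\left( r \int_{\hat T_n} Z_{h,\theta}(v)\, dv - r \int_{\hat T_n(r)} Z_{h,\theta}(u)\, du < \frac{r}{2} \int_{\hat T_n} Z_{h,\theta}(v)\, dv,\ G \right) $$
$$ \le P_f\left( r \int_{\hat T_n(r)} Z_{h,\theta}(u)\, du > \frac{r}{2} \int_{\hat T_n} Z_{h,\theta}(v)\, dv,\ G \right) $$
$$ \le P_f\left( \int_{\hat T_n(r)} \|u\|_1 Z_h(u)\, du \ge r/2,\ G \right). $$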
The last inequality together with (5.8), (5.10) and (5.11) yields

$$ P_f\left( nh^d |\hat f^h(y) - f(y)| \ge \varepsilon,\ G \right) \le 2 P_f\left( \int_{\hat T_n(r)} \|u\|_1 Z_h(u)\, du \ge \frac{r}{2},\ G \right). $$

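3°. Now, let us prove Assertion 2. Put $T_n(a) = T_n \cap (u \in \mathbb{R}^{D_b} : \|u\|_1 \ge a)$
for all $a > 0$ and $\Omega_v = T_n(v) \setminus T_n(v + 1)$ for any $v \ge a$. Introduce the following
notations:

$$ I_v = \int_{\Omega_v} Z_{h,\theta}(u)\, du, \qquad Q_v = \frac{\int_{\Omega_v \cap \hat T_n} Z_{h,\theta}(u)\, du}{\int_{\hat T_n} Z_{h,\theta}(u)\, du}. $$

Fix $T > 0$ whose choice will be done later. Consider the minimal number
$N(\Omega_v, 1/T)$ of balls of radius $1/T$ that are needed to cover the set $\Omega_v$. Denote
by $u^j$ the center of the $j$-th ball. Since $\Omega_v$ is a compact subset of $\mathbb{R}^{D_b}$, it implies
$N(\Omega_v, 1/T) \le (v + 1)^{D_b} T^{D_b}$. Introduce the non-intersecting parts $\Delta_1, \Delta_2, \Delta_3, \dots$
as follows: $\Delta_1 = \{u \in \Omega_v : \|u - u^1\|_1 \le 1/T\}$ and

$$ \Delta_j = \{u \in \Omega_v : \|u - u^j\|_1 \le 1/T\} \setminus \bigcup_{i < j} \Delta_i, \quad j = 2, \dots, N(\Omega_v, 1/T). $$

Put $S_v = \sum_j \int_{\Delta_j} Z_{h,\theta}(u^j)\, du$ and note that $S_v$ is a stepwise approximation of $I_v$.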



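Control of $I_v$. Recall that $\Omega_v = \bigcup_{j=1}^{N(\Omega_v, 1/T)} \Delta_j$ and denote by $|\Omega_v|$ the
volume of $\Omega_v$. We get for any $\sigma > 0$

$$ P_f(S_v > \sigma) \le P_f\left( \max_j Z^{1/2}_{h,\theta}(u^j)\, |\Omega_v|^{1/2} > \sigma^{1/2} \right) \le \sum_j P_f\left( Z^{1/2}_{h,\theta}(u^j) > \sigma^{1/2} |\Omega_v|^{-1/2} \right). $$

Note that the number of summands on the right-hand side of the last inequality
does not exceed $(v + 1)^{D_b} T^{D_b}$. Applying the Tchebychev inequality and Lemma 1
(2), we obtain

$$ P_f(S_v > \sigma) \le (v + 1)^{D_b} T^{D_b} \sqrt{|\Omega_v|}\, \sigma^{-1/2} e^{-g_h(v)}. \tag{5.12} $$

In view of Lemma 1 (1),

$$ E_f |S_v - I_v| \le \sum_j \int_{\Delta_j} E_f |Z_{h,\theta}(u) - Z_{h,\theta}(u^j)|\, du \le C_h \sum_j \int_{\Delta_j} \|u - u^j\|_1\, du. $$

By definition of $\Delta_j$, each summand does not exceed $T^{-1} |\Delta_j|$, therefore,

$$ E_f |S_v - I_v| \le C_h |\Omega_v| T^{-1}. \tag{5.13} $$

One has

$$ P_f(I_v > 2\sigma) \le P_f(S_v > \sigma) + P_f(|S_v - I_v| > \sigma). $$

Using (5.12), (5.13) and applying the Tchebychev inequality, we get

$$ P_f(I_v > 2\sigma) \le (v + 1)^{D_b} T^{D_b} \sqrt{|\Omega_v|}\, \sigma^{-1/2} e^{-g_h(v)} + C_h |\Omega_v| T^{-1} \sigma^{-1}. \tag{5.14} $$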

Control of Q^,. Set A = | Zh,e{u)du < ^^''/2|. Since Qv < 1 we obtain 
for any S > 0, a > 

< ¥f (A, G) + ¥f{l„ > 2a) + iS^^'a. 

Under the event $G$, remark that $[0, \delta]^{D_b} \subset nh^d\left(\Theta(A(f), 2M(f)) - \theta\right) \subset \hat T_n$ for
any $\delta \le (2M(f) - A(f))$. Using Lemma 1 (3) and the inequality (5.14), we
have

Choosing T = exp | 3^3,1 (u)|, a = exp [-^h^ghiv)^ and 6 = exp |-6^.9h(w)|, 
we obtain 

exp < ~-^T^9h{v) 

Conclusion of the proof of Assertion 2. Simplest algebra shows that -^/jf^J < 
{2v + 2) ' , we get 

E/S. < [i;^''+i+2(2u + 2)'''''+5]C/, exp|-^5^(u)|, (5.15) 

Note that if the event $G$ is realized then $\hat T_n(a) \subseteq T_n(a) = \bigcup_{j=0}^{\infty} \Omega_{a+j}$. We obtain
in view of (5.15)



P 00 
E/ / \\u\\iZh{u)lGdu < y^{a + j + l)EfQa+j 

if„n(||«||i>a) ^.^0 

= E(M(/))aS,C^ exp|-^.9^(a)| 
where we have put Ba ^ a'°''+i + 2(2w + 2)^^" + 5 + A (a^» + {2a + 2)^^'^ 
To prove the proposition it suffices to integrate the inequality obtained in
Proposition 1 and to use the following lemma, which will be extensively exploited
in the sequel.

Lemma 2. There exists $\lambda > 0$ such that $\forall n \ge 1$ and $\forall h \in \mathcal{H}_n$, we have

$$ \lambda_n(h) \ge \lambda, $$

where $\lambda_n(h)$ is the smallest eigenvalue of the matrix

$$ \mathcal{M}_{nh}(y) = \frac{1}{nh^d} \sum_{i=1}^{n} K^{\top}\!\left(\frac{X_i - y}{h}\right) K\!\left(\frac{X_i - y}{h}\right) \mathbf{1}_{V_h(y)}(X_i) $$

and $K(z)$ is the $D_b$-dimensional vector of the monomials $z^p$, $p \in \mathcal{P}_b$.
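To make Lemma 2 concrete, the following sketch computes the smallest eigenvalue of the design matrix in a toy case and compares it with the smallest eigenvalue of the limit matrix $\int_{[-1/2,1/2]} K^{\top}(x)K(x)\,dx$ used in the proof. The setting is an assumption made for illustration only: dimension $d = 1$, regular design $X_i = i/n$, window $V_h(y) = [y - h/2, y + h/2]$, and monomials $1, z, \dots, z^b$; the function names are hypothetical.

```python
import numpy as np

def monomials(z, b):
    """Rows K(z_i) = (1, z_i, ..., z_i^b) for a 1-d array z."""
    return np.vander(z, N=b + 1, increasing=True)

def design_matrix(n, h, y, b):
    """(n h)^{-1} * sum over X_i in the window of K^T K, for d = 1."""
    x = np.arange(1, n + 1) / n          # regular design X_i = i/n
    z = (x - y) / h
    k = monomials(z[np.abs(z) <= 0.5], b)
    return k.T @ k / (n * h)

def limit_matrix(b):
    """Entries int_{-1/2}^{1/2} x^{p+q} dx of the limit matrix."""
    m = np.zeros((b + 1, b + 1))
    for p in range(b + 1):
        for q in range(b + 1):
            s = p + q
            # odd powers integrate to zero on a symmetric interval
            m[p, q] = 0.0 if s % 2 else 2.0 * 0.5 ** (s + 1) / (s + 1)
    return m

lam0 = float(np.linalg.eigvalsh(limit_matrix(2))[0])
lam_n = float(np.linalg.eigvalsh(design_matrix(n=20000, h=0.2, y=0.5, b=2))[0])
```

In this toy setting `lam_n` is strictly positive and close to `lam0`, which is the behaviour that the three steps of the proof in Section 7.2 establish in general.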

Proof of Proposition 2. In order to simplify the proof, let us introduce the 
following constants 



Cl 



(l + 6Dg) 



A 



C2 



By definition of $A(f)$, $M(f)$, $\mathfrak{B}(\cdot, \cdot)$ respectively in (1.10), (5.7) and of $A$, $M$, we
have the following inequality: $\mathfrak{B}(A(f), M(f)) \le \mathfrak{B}(A, M)$. By integration of
Proposition 1 and using Lemma 2, we get for any $q \ge 1$ and $f \in \mathbb{H}_d(\beta, L, M, A)$

Ef\f\y)-f{y)\'lG 

v'-'Vf{\f'\y)-fiy)\>v,G) dr, 

(nh")-' v'-'Vf (if'^iy) ^ f{y)\ > ^,g) 



-f OO 



r]''-^ dri 

v'-'Pf{\f''{y)-f{y)\ 



< 



{nh'^Y 



-l + -^»(A(/),Af(/))r(g) 

g ^2 C2 



where $\Gamma(\cdot)$ is the well-known Gamma function. By definition of $b_h$ and $N_h$
respectively defined in (5.3) and (5.4), the assertion of Proposition 2 is proved:

$$ E_f |\hat f^h(y) - f(y)|^q \mathbf{1}_G \le C_q^*(A(f), M(f)) \left[ \frac{1 \vee L d\, n h^{\beta + d}}{n h^d} \right]^q, $$

where



C*(a,m) 



432mD3(l + 6Dfc^) 



(a, m)r{q), a, m > 0. 
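The Gamma function in the constant comes from integrating an exponential tail bound. As a hedged illustration (not the exact computation of the proof), the sketch below checks numerically the identity $q \int_0^{\infty} t^{q-1} e^{-ct}\, dt = q\,\Gamma(q)/c^q$, which is the mechanism turning a deviation inequality of the type of Proposition 1 into a bound on the $q$-th moment. The values of `q` and `c` are arbitrary.

```python
import math

def moment_from_tail(q, c, t_max=200.0, steps=200_000):
    """Trapezoid approximation of q * int_0^t_max t^(q-1) * exp(-c t) dt."""
    dt = t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * dt
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * q * t ** (q - 1) * math.exp(-c * t)
    return total * dt

q, c = 3.0, 0.7                       # arbitrary illustration values, q >= 1
numeric = moment_from_tail(q, c)
closed = q * math.gamma(q) / c ** q   # closed form q * Gamma(q) / c^q
```

The two numbers agree up to discretization error, which is why a factor $\Gamma(q)$ and a power $q$ of the inverse exponential rate appear in $C_q^*$.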



5.4. Proof of Theorem 2.2 

By definition of $h = (Ln)^{-\frac{1}{\beta + d}}$, we have

Applying the inequality given in Remark 4, we come to the assertion of the 
theorem. ■ 
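A small numerical sketch of the bandwidth choice may help here. It assumes that the bandwidth in this proof is $h = (Ln)^{-1/(\beta+d)}$ and that the relevant quantity is the bracket $(1 \vee Ld\,nh^{\beta+d})/(nh^d)$ of Proposition 2; under these assumptions the bracket equals $d\,L^{d/(\beta+d)} n^{-\beta/(\beta+d)}$. The function names and numerical values are hypothetical.

```python
def risk_bracket(n, h, beta, L, d):
    """(1 v L d n h^(beta+d)) / (n h^d), the bracket of Proposition 2."""
    return max(1.0, L * d * n * h ** (beta + d)) / (n * h ** d)

def balanced_bandwidth(n, beta, L, d):
    """Bandwidth (L n)^(-1/(beta+d)) balancing bias and stochastic terms."""
    return (L * n) ** (-1.0 / (beta + d))

n, beta, L, d = 10_000, 2.0, 3.0, 1     # arbitrary illustration values
h = balanced_bandwidth(n, beta, L, d)
bound = risk_bracket(n, h, beta, L, d)
rate = d * L ** (d / (beta + d)) * n ** (-beta / (beta + d))
```

With this choice $L d\, n h^{\beta+d} = d$, so the bracket reduces to $d/(nh^d)$ and `bound` coincides with `rate`.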



5.5. Proof of Theorem 3.3 

This proof is based on the Lepski scheme developed in [13] and adapted to
bandwidth selection in [16]. We start the proof by formulating auxiliary
lemmas whose proofs are given in the Appendix (Section 7). Define



h* = 



n ( 1 + 



{h + d){P + d) 
where the positive constant c is chosen as follows 



In? 



c<[lM/{Ld)] [1A4/Af(/)] 



d-l 



1 A 



A 



lUMDh 



P + d 

and let the integer $\kappa$ be defined as follows.

The definitions of $h^*$ and $\kappa$ imply the following lemmas.
Lemma 3. 



1 A 



QAD 



2 1 



1 + 6Z)2 



(5.16) 



E/|/^'Hy)-/(y)riG<a 



(l + fcln2)'i 



c(/3 + d) 



Vfc > K, 



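Lemma 4. For any $f \in \mathbb{H}_d(\beta, L, M, A)$ and any $k \ge \kappa + 1$

$$ P_f(\hat k = k,\ G) \le J_2\, \mathfrak{B}(A, M) \exp\left\{ J_1 n (h^*)^{\beta + d} \right\} 2^{-(k-1)(8qd + 4)}, $$

where $J_1 = Ld(1 + 6D_b^2)/(6AD_b^2)$ and $J_2 = (1 - 2^{-(8qd + 4)})^{-1}$.
Lemma 5. There exists a universal constant $\vartheta > 0$ such that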



lim sup sup exp ■ 

n-J-oo f£Wd(flX,M,A) 



16Afl?2£)2 



G") = 0. 
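Proof of Theorem 3.3. We decompose the risk as follows

$$ E_f |\hat f^{(\hat k)}(y) - f(y)|^q \mathbf{1}_G \le E_f |\hat f^{(\hat k)}(y) - f(y)|^q \mathbf{1}_{\{\hat k \le \kappa,\ G\}} + E_f |\hat f^{(\hat k)}(y) - f(y)|^q \mathbf{1}_{\{\hat k > \kappa,\ G\}} = R_1(f) + R_2(f). \tag{5.17} $$

First we control $R_1$. Obviously

$$ |\hat f^{(\hat k)}(y) - f(y)| \le |\hat f^{(\hat k)}(y) - \hat f^{(\kappa)}(y)| + |\hat f^{(\kappa)}(y) - f(y)|. $$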

Note that the realization of the event G implies M < 3M(/)/2. This together 
with the definition of k yields 

\PHy) - f^Hy)\h<.,G < C^Sni^), Sn{k) = (1 + fcln2)'(n/j^)-^ 
where C = 288M DfX^^ {32qd + 16). In view of Lemma 3 we also get 

E/|/^"Hy)-/(y)r <c,s„(k). 

Noting that the right-hand side of the obtained inequality is independent of $f$ and
taking into account the definition of $\kappa$ and $h^*$ we obtain

$$ \limsup_{n \to \infty} \sup_{f \in \mathbb{H}_d(\beta, L, M, A)} \phi_n^{-q}(\beta)\, R_1(f) < \infty. \tag{5.18} $$

Now let us bound $R_2$ from above. Applying the Cauchy–Schwarz inequality we
have in view of Lemma 4



< J2{Ef\f('Hy)-f{y)\y/'^Vf{k^k,G} 

k>K 

= A(r)5^(E/|/(''0(y)-/(y)|''^)^/'2-('=-i)(4«'^+2)^ (5^9) 

where we have put A{h*) = J2^{A, M) exp {Jin{h*)P+'^} . We obtain from 
Lemma 3 and (5.19) 

i?2(/) < J3 {nhi,,)-''exp{JMh*f^'}, (5.20) 

where 



J3 = J2*8(A, M) 24«''+2 Cllj^ ^(1 + sln2)«2'^ 



s>0 

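It remains to note that the definition of $h^*$ implies that

$$ \limsup_{n \to \infty} \phi_n^{-q}(\beta)\, (nh_\kappa^d)^{-q} \exp\left\{ J_1 n (h^*)^{\beta + d} \right\} < \infty $$

and that the right-hand side of (5.20) is independent of $f$. Thus, we have

$$ \limsup_{n \to \infty} \sup_{f \in \mathbb{H}_d(\beta, L, M, A)} \phi_n^{-q}(\beta)\, R_2(f) < \infty, $$

which yields together with (5.17) and (5.18)

$$ \limsup_{n \to \infty} \sup_{f \in \mathbb{H}_d(\beta, L, M, A)} \phi_n^{-q}(\beta)\, E_f |\hat f^{(\hat k)}(y) - f(y)|^q \mathbf{1}_G < \infty. $$

To get the assertion of the theorem it suffices to show that

$$ \limsup_{n \to \infty} \sup_{f \in \mathbb{H}_d(\beta, L, M, A)} \phi_n^{-q}(\beta)\, E_f |\hat f^{(\hat k)}(y) - f(y)|^q \mathbf{1}_{G^c} < \infty. \tag{5.21} $$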

Note that $\hat f^{(k)}(y) \le 4\hat M$ in view of (3.4). Note also that the local least squares
estimator $\hat\delta$ is a linear function of the observations $Y^{(n)}$ and, moreover, $0 \le Y_i \le M$, $i =
1, \dots, n$. This together with the definition of $\hat M$ (expression (3.3)) allows us to
state that there exists $0 < J_4 < +\infty$ such that $|\hat f^{(\hat k)}(y) - f(y)| \le J_4 M$. Here we
have also taken into account that $\|f\|_\infty \le M$.
Finally we obtain 

and (5.21) follows now from Lemma 5. ■



6. Proofs of lower bounds 

The proofs of Theorems 2.1 and 3.2 are based on the following proposition. 

Put 0,1(7) = [n-i(l + (fe-7)lnn)]^, 76 (0,6] and let 



Ri^\Lv)^ sup Ef 



Cn«)|/(y)-/(y)r 



+ sup E f 

feMa{P.L.M,A) 

where f > and a,(3 e (0, 5]^. 

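Proposition 3. Let $\Psi$ be an admissible family of normalizations such that

$$ \psi_n(\alpha)/\phi_n(\alpha) \to 0, \quad n \to \infty. $$

Then, for any $0 \le v < \dfrac{\beta - \alpha}{(\beta + 1)(\alpha + 1)}$

$$ \liminf_{n \to \infty} \inf_{\tilde f} R_n^{(q)}(\tilde f, v) > 0. $$

The proof is given in Section 6.3.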
6.1. Proof of Theorem 2.1

Using Proposition 3 with $\beta = \alpha$, we have to choose $v = 0$ and one gets

i?„,,[Hd(/3,L,M,A)] = i?i'n/,0) 

sup \f{y) ~ /(y)|«l > 0, V/. 

/eHd(a,L,M,A) L J 



6.2. Proof of Theorem 3.2

I. To prove the first assertion of the theorem it suffices to consider the
family $\{\psi_n(\beta)\}_{\beta \in (0, b]}$, where $\psi_n(\alpha) = \varphi_n(\alpha)$ and $\psi_n(\beta) = 1$ for any $\beta \ne \alpha$.
The corresponding attainable estimator is the estimator that is minimax on
$\mathbb{H}_d(\alpha, L, M, A)$.

II. Let us consider the family $\{\phi_n(\beta)\}_{\beta \in ]0, b]}$, which is admissible in view of
Theorem 3.3. First, we note that $\gamma = b$ is not possible since $\phi_n(b) = \varphi_n(b)$, the
minimax rate of convergence on $\mathbb{H}_d(b, L, M, A)$.

Thus we assume that 7 satisfying (3.2) belongs to 6 e]0, b[. Let /* be a ^I/^")- 
attainable estimator. Since 1pn{c^) / (j^nio) — ?► 0, n 00 in view of (3.2) then 
obviously 



lim sup sup E y 

Tt->-oo /eHi(7,L,Af,A) 



C^(7)|/*(y)-/(y)r' 



= 0. 



Therefore, applying Proposition 3 with $v = 0$ we have for any $\beta < \gamma$



lim sup sup E / 



'^,T''(/3)|/*(y)-/(y)| 



> 0. 



We conclude that necessarily tpn{P) > 4>n[l3) for any /3 < 7. 

Moreover, for any $\beta > \gamma$, applying Proposition 3 with an arbitrary $0 \le v <
\dfrac{\beta - \gamma}{(\beta + 1)(\gamma + 1)}$ we obtain that

V'«(/3) > n>„(/3), /?>7- 

It remains to note that the form of rate of convergence proved in Theorem 2.1 
implies that 

0n (7) (7) = o ( [In n] ^ 
6.3. Proof of Proposition 3

Let $\varkappa > 0$ be the parameter whose choice will be done later. Put

1 + (/3 - Q;)lnn\ ^ 



Later on, without loss of generality, we will assume that $L > 1$.
Consider the functions $f_0 \equiv 1$ and

hix) = 1-{L- 1),.*</.„(«)F (^^, , a: e [0, 1]'^. 

Here $F$ is a compactly supported positive function belonging to $\mathbb{H}_d(\alpha, 1, M, A)$
such that $F(0) = 1 = \max_x F(x)$.

It is easily seen that $f_1 \in \mathbb{H}_d(\alpha, L, M, A)$. Therefore, we have



> Eo 



n-^^-\P){f{y)-l) 



El 
El 



0-i(«)(/(?y)-/i(y)) 



where z = (i - l)>fSTTi^(o). Set 
A = ^-i(a)(l-/(y)), 



where g 



p+TTM^T)- Weget 

Rl?\f,v) > Eo|^„A|'+Ei|z-A|'' 

^{|A|>^/2} 



{|A|<2/2} 



> Eok, 



"2 I ^{|A|>z/2} ^IEl|2l ^{|A|<;^/2}- 

Noting that fi < fo, since F is positive, and putting c„(f(")) = I{|X|>2/2} we 
obtain 



Rl?Hf,v)><^?,-^ 



2^ nr=i/i(^o 







c„(a;)(ixi . . . dxn 







2« n:u/i(^oio 



We have 



1=1 



1 — Cn{x)dxi . . . dXn- (6.1) 



> 



{1-{L- l)>^^(/.„(a))"''" > e-(^-i)-n-(^-i)-('3-")(6.2) 
We obtain in view of (6.1) and (6.2)



1 



Cn{x)dxi . . . dXr. 



nr=i/i(^o7o 

I fhiXi) rfi{X„) 



2« UtlfliX^)Jo 



1 — Cn{x)dxi . . . dXn 



> — 
- 29 



Case 1 : (i = a. Choosing x — 1, and noting that <j„ — 1 and nr=i — 
e~''^~'^\ we deduce from (6.1) that yields : 



/ 29 



Case 2: $\beta > \alpha$. Put



q{g-v) -tri q 1 

>0, tn = hi — > 0. 



1 + (L - l)il3 - a) ' Inn (l + (/? - a) Inn)"^ " 

This choice provides us with the following bound 

^^^e'iL-D^n-^'^-DxiP-c) = {I + {(3 -a) lnn)"'^e-(^-i)'^n«('?-'')-(^-i)-(^-") 

> (1 + (;3 _ c,) Inn)"' Vi«(^-i)n*" > e-i^^^-^). 

This yields 

in/i?i'^(/;«)>^^^^^|^e-i^(-^)>0. 
/ 29 



7. Appendix 

7.1. Proof of Lemma 1

Later on, without loss of generality, we will suppose that $nh^d \in \mathbb{N}^*$. In order
to simplify the understanding of this proof, we denote the approximation polynomial by
$A_u^i = f_{\theta + u(nh^d)^{-1}}(X_i)$, $i = 1, \dots, n$, for all $u \in T_n$.

1. Note that for $u \in T_n$

$$ E_f Z_{h,\theta}(u) \le \prod_{i:\, X_i \in V_h(y)} \frac{A_0^i}{f(X_i)} \le e^{N_h/A(f)}. \tag{7.1} $$

The first inequality is a consequence of the definition of $Z_{h,\theta}$ in (5.6) and of the
following calculation

$$ E_f \mathbf{1}\left[ Y_i \le A_u^i \right] = P_f\left( Y_i \le A_u^i \right) = 1 \wedge \frac{A_u^i}{f(X_i)}. $$

In (7.1), the second inequality is obtained with the classical inequality $1 + \rho \le e^{\rho}$, $\rho \in
\mathbb{R}$, recalling that $f_\theta(x) \ge f(x)$.



TT -^0 TT i .4i-/(Ji) -\ ft. 



Case 1 : If \\ui — tt2|li > 1, the inequality (7.1) allows to get 



Case 2 : Assume now that — W2II1 < 1 and introduce the random events 
Fi = {Vz = l,...,n: < A^^ A AlJ , 

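We have used the following notations: $a \wedge b = \min(a, b)$ and $a \vee b = \max(a, b)$, $a, b \in
\mathbb{R}$. For any $(u_1, u_2) \in T_n^2$ we have

$$ E_f |Z_{h,\theta}(u_1) - Z_{h,\theta}(u_2)| = E_f |Z_{h,\theta}(u_1) - Z_{h,\theta}(u_2)|\, \mathbf{1}[F_1] + E_f |Z_{h,\theta}(u_1) - Z_{h,\theta}(u_2)|\, \mathbf{1}[F_2] + E_f |Z_{h,\theta}(u_1) - Z_{h,\theta}(u_2)|\, \mathbf{1}[F_3] $$
$$ = \mathcal{K}_1 + \mathcal{K}_2 + \mathcal{K}_3. \tag{7.2} $$

The following bound will be extensively exploited in the sequel:

$$ f_t(x) \ge 2 t_{0,\dots,0} - \|t\|_1 \ge 0.25 A(f), \quad \forall t \in \Theta(A(f)/4, 9M(f)),\ x \in [0, 1]^d. $$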

Control of /Ci. 



n 



/I* 



n 



4* 



/ 



and 



i- Xi^Vniy) 



fix,) 



(7.3) 



(7.4) 
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Therefore, using (7.1), we have 

^ h- n f^] n 

< e^''/-^(^) ( 1 - exp I y In ^^^-^"- I ] . (7.5) 

Remember that ^ -^La I - ('^/i'*)"^ll'"i ^ W2II1 and > A{f)/4. Let us 
give the foUowing calculation with inequality of finite increments for In(-) 

In - - |ln-4L, A Al^ - In^',^ V | > Jj^l^^Tll}^!^ 

At \/ A% I ^1 ^2 i^i "2 1 — At' t\ At 

Using (7.3), (7.4), (7.5), the last inequality and the well-known
inequality $1 - e^{-\rho} \le \rho$, we have

/Ci<-;^e-^'^M(/)||^i-U2||i. 

Control of $\mathcal{K}_2$. We can rewrite

F2 = {Vi = l,...,n: <^L^ V^;j 

\{Vz = l,...,n: r,<^^^A^:,J 
= G\Fi. 



and define 

^1 - {X, e Vh{y) : Al, V < /(X,)} , 
Q2 = {X, e Vhiy) : AL, A < fiXt)} • 

Note that Fi C G and, therefore, 

n -^0 ( n n 

11 At A \ 11 frr.^ 11 



The definition of Q2 implies 



n Ai A Ai - Y\. Ai A Ai IT ffY- 
Since $\mathcal{G}_1 \subseteq \mathcal{G}_2$, $\|u_1 - u_2\|_1 \le 1$ and $|f_u(x)| \le \|u\|_1$, $\forall x \in [0, 1]^d$, $\forall u \in T_n$, using
the last inequality and (7.1), we obtain

^2 < 11 77^ 11 -1 

< 4Dte'+^^/^^f^\\ui-U2\\i/A{f). 
Control of $\mathcal{K}_3$. We can rewrite the process $Z_{h,\theta}$ with the notation $A_u^i$

^mh= n #%<^Li- 

Under the event F3, we get 

\Zh,e{ui) - ZhM{u2) \ I[F3] = 

Then $\mathcal{K}_3 = 0$.

The first assertion of the lemma follows from (7.2) and the bounds on $\mathcal{K}_1$,
$\mathcal{K}_2$ and $\mathcal{K}_3$.

2. For any u € T„, since the random variables (li)i are independent we have, 



%^^?(") = n 

i:X,eVh{y) 




For any z, we have 





1 A 



< 



An 



fix,) ^ ^ 



/ / Ai 

•^0 V -^ti 



Recall that in view of (5.5) $f_\theta(x) \ge f(x)$ and $0 < f_\theta(x) \le 3M(f)$ for $x \in V_h(y)$.

Moreover, for $u \in T_n = nh^d\left(\Theta(A(f)/4, 9M(f)) - \theta\right)$, $0 < f_{\theta + u(nh^d)^{-1}}(x) \le
9M(f)$. Thus for all $i$ such that $X_i \in V_h(y)$,




The last inequality implies 



•^0 ^ V -^Ji 



< 



1 - 



1 1/2 



E/^^i'(") 



TT -^0 /-, _ l/n(nfe'')-i(-'^i)l 

f{Xi)\l QM(f\ 



i-X,eVu{y) 



9Mif) 



exp 



'lSM(f ) nh'i 



J2 |/u(XO| I. (7.6) 
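It remains to show

$$ \frac{1}{nh^d} \sum_{i:\, X_i \in V_h(y)} |f_u(X_i)| \ge \lambda_n(h)\, D_b^{-1} \|u\|_1. \tag{7.7} $$

Let us remember that $u = (u_p,\ p \in \mathcal{P}_b)$ (where $\mathcal{P}_b$ is defined in (1.4)). First, we
get from the definition of $f_u$

$$ f_u(x) = u\, K^{\top}\!\left(\frac{x - y}{h}\right) = K\!\left(\frac{x - y}{h}\right) u^{\top}, \quad \forall x \in [0, 1]^d, $$

and, therefore,

$$ \frac{1}{nh^d} \sum_{i:\, X_i \in V_h(y)} f_u^2(X_i) = u\, \mathcal{M}_{nh}(y)\, u^{\top}. $$

Assume $u \ne 0$ and put $v = u/\|u\|_1$. Noting that $|f_v(x)| \le 1$, $\forall x \in [0, 1]^d$, we
have

$$ \frac{1}{nh^d} \sum_{i:\, X_i \in V_h(y)} |f_u(X_i)| = \frac{\|u\|_1}{nh^d} \sum_{i:\, X_i \in V_h(y)} |f_v(X_i)| \ge \frac{\|u\|_1}{nh^d} \sum_{i:\, X_i \in V_h(y)} f_v^2(X_i) = \|u\|_1\, v\, \mathcal{M}_{nh}(y)\, v^{\top} \ge \lambda_n(h) \|u\|_1 \|v\|_2^2 \ge \lambda_n(h)\, D_b^{-1} \|u\|_1, $$

where the last step uses $\|v\|_2^2 \ge \|v\|_1^2 / D_b = 1/D_b$.

The bound (7.7) follows now from Lemma 2. The assertion of the lemma follows
from (7.6) and (7.7).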
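The inequality (7.7) can be tested numerically in a toy configuration. The sketch below rests on assumptions made for illustration only: $d = 1$, design $X_i = i/n$, window $V_h(y) = [y - h/2, y + h/2]$, monomials up to degree $b$ (so $D_b = b + 1$), and Gaussian random directions $u$; `check_77` is a hypothetical helper name.

```python
import numpy as np

def check_77(n=5000, h=0.3, y=0.5, b=2, trials=200, seed=1):
    """Return (lambda_n(h), min over random u of LHS - RHS in (7.7))."""
    rng = np.random.default_rng(seed)
    x = np.arange(1, n + 1) / n
    z = (x - y) / h
    # rows K((X_i - y)/h) for the design points inside the window
    k = np.vander(z[np.abs(z) <= 0.5], N=b + 1, increasing=True)
    m = k.T @ k / (n * h)                   # design matrix for d = 1
    lam = float(np.linalg.eigvalsh(m)[0])   # smallest eigenvalue
    d_b = b + 1
    worst = np.inf
    for _ in range(trials):
        u = rng.normal(size=d_b)
        lhs = np.sum(np.abs(k @ u)) / (n * h)     # (n h)^{-1} sum |f_u(X_i)|
        rhs = lam * np.sum(np.abs(u)) / d_b       # lambda_n(h) D_b^{-1} ||u||_1
        worst = min(worst, lhs - rhs)
    return lam, float(worst)

lam, worst_gap = check_77()
```

A non-negative `worst_gap` means that the inequality held for every sampled direction, as the chain $|f_v| \ge f_v^2$ (valid because $|f_v| \le 1$ on the window) predicts.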



3. In view of Lemma 1 (1), we have

$$ E_f |Z_{h,\theta}(u) - Z_{h,\theta}(0)| \le C_h \|u\|_1, \quad u \in T_n \setminus \{0\}. \tag{7.8} $$

Taking into account that $Z_{h,\theta}(0) = 1$ we obtain, applying (7.8), Fubini's theorem
and the Tchebychev inequality,


<Ff\[ ■■■ [ \ZhAv) - Zh.em\dv > h""^] 



Jo 











Jo 


Jo 



7.2. Proof of Lemma 2 

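First step: $\mathcal{M}_{nh}(y)$ is a positive definite matrix.
Let $\mathcal{H}_n$, $n \ge 1$, be defined in (5.1). First, we prove that

$$ \inf_{h \in \mathcal{H}_n} \lambda_n(h) > 0, \quad \forall n \ge 1. \tag{7.9} $$

Suppose that $\exists n_1 \ge 1$, $h_{n_1} \in \mathcal{H}_{n_1}$ such that $\lambda_{n_1}(h_{n_1}) = 0$. Recall that $f_t(x) =
t\, K^{\top}(h^{-1}(x - y))$ for all $t \in \mathbb{R}^{D_b}$ and note that $\forall t \in \mathbb{R}^{D_b}$

$$ t^{\top} \mathcal{M}_{n_1 h_{n_1}}(y)\, t = \frac{1}{n_1 h_{n_1}^d} \sum_{i:\, X_i \in V_{h_{n_1}}(y)} f_t^2(X_i). $$

Since $\lambda_{n_1}(h_{n_1})$ is the smallest eigenvalue of the matrix $\mathcal{M}_{n_1 h_{n_1}}(y)$, the assumption
$\lambda_{n_1}(h_{n_1}) = 0$ implies that there exists $\tau^*$ belonging to the unit sphere of
$\mathbb{R}^{D_b}$ such that

$$ \tau^{*\top} \mathcal{M}_{n_1 h_{n_1}}(y)\, \tau^* = 0. $$

It obviously implies that $f_{\tau^*}(X_i) = 0$ for all $X_i \in V_{h_{n_1}}(y)$. It remains to note
that $n h_{n_1}^d \ge (b + 1)^d$ since $h_{n_1} \in \mathcal{H}_{n_1}$ and to apply the result obtained in [17]
(page 20). It yields $\tau^* = 0$ and the obtained contradiction proves (7.9).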

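Second step: $\mathcal{M}_{nh}(y) \to \mathcal{M}$ as $n \to \infty$.
Let $\lambda_0$ be the smallest eigenvalue of the matrix

$$ \mathcal{M} = \int_{[-1/2, 1/2]^d} K^{\top}(x) K(x)\, dx $$

whose general term is given by

$$ \mathcal{M}_{p,q} = \prod_{j=1}^{d} \int_{-1/2}^{1/2} x_j^{p_j + q_j}\, dx_j, \quad 0 \le |p|, |q| \le b. $$

Let us prove that

$$ \limsup_{n \to \infty} \sup_{h \in \mathcal{H}_n} |\lambda_n(h) - \lambda_0| = 0. \tag{7.10} $$

Put $m = n^{1/d}$ and, without loss of generality, assume that $m$ is an integer.
Recall that the general term of the matrix $\mathcal{M}_{nh}(y)$ is given by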



1 



ip,q X! n 



Pj+li 



I I.I I. " 

where Xi. — ij / m iov aW j = 1, . . . , d and Xi = {Xi^ , Xi^. We get 



1 



< 



E n 

»:XiGVh(y)j = l-'''J" 
1 



dx-i 



n 



< 



E n 

d 

E n 



Pj+Qj 



It yields by change of variables that 

d .i ^ 



Xj/m — y.j 



Pj+i] 



dxi 



n 



E n 

i:XieVfc(a)j = l 
/.i+2(n/i'*)-i 



< 



(7.11) 



Note that nh'^ > Ini+'J {n) for any h e "Hn. This together with (7.11) yields 
limsup sup (Mnh{y))„- Mp,q ^0, < |g| < 6. 

The last result obviously implies (7.10).

Third step: Conclusion.
First we show that $\lambda_0 > 0$. Indeed, $\forall t \in \mathbb{R}^{D_b}$

$$ t^{\top} \mathcal{M}\, t = \int_{[-1/2, 1/2]^d} [f_t(x)]^2\, dx \ge 0. $$



M. Chichignoud/ Locally bayesian approach 






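Since $\lambda_0$ is the smallest eigenvalue of the matrix $\mathcal{M}$, the assumption $\lambda_0 = 0$ would
imply that there exists $\tau^*$ belonging to the unit sphere of $\mathbb{R}^{D_b}$ such that $f_{\tau^*} \equiv 0$.
Since $f_{\tau^*}$ is a polynomial, the last identity is possible if and only if $\tau^* = 0$. The
obtained contradiction shows that $\lambda_0 > 0$.

Next, note that in view of (7.10) there exists $n_0$ such that $\forall n \ge n_0$ and
$\forall h \in \mathcal{H}_n$, $\lambda_n(h) \ge \lambda_0/2$.

On the other hand, in view of (7.9), $\min_{n \le n_0} \inf_{h \in \mathcal{H}_n} \lambda_n(h) > 0$. It remains to
define $\lambda > 0$ as

$$ \lambda = \min\left\{ \min_{n \le n_0} \inf_{h \in \mathcal{H}_n} \lambda_n(h),\ \lambda_0/2 \right\}. $$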

7.3. Proof of Lemma 3 

Recall that $h_k \le h_\kappa \le h^*$ by definition of $h_k$, $h^*$ and $\kappa$ (see (5.16)). Using
Proposition 2 with $h = h_k$, it yields



Ef\f^''\y)-f{y)\\G < C;{Aif),M{f)) 



nhf j 



' ly Ldn{h*)'^+'^Y 
4 

The control of n{h*)^^'^ requires the following calculation. 



, a,-(,4,/,,„(/,)(l^«f^)'.(7.i2, 



'""•'°"^'+ (i,+y+.) '°"-'"-w ("3) 

where Pn{P) is the price to pay for adaptation defined in (1.9). By definition of 
hk, we have 

1 I 1 o 1 I 1 ^max \ -, I 1 ^max 

1 + K In 2 = 1 + In — > 1 + m — 

hk h* 

Using the classical inequality ln(l -\- x) < x and c < 1, we obtain with the last 
inequality 

^ PniP) < 1 + Kln2 < 1 + fcln2,Vfc > K. (7.14) 

p + d 

According to (7.12), (7.13) and (7.14), Lemma 3 is proved. ■



7.4. Proof of Lemma 4 

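Note that for any $k \ge \kappa + 1$ and by definition of $\hat k$ in (3.8)

$$ \{\hat k = k\} = \bigcup_{l \ge k} \left\{ |\hat f^{(k-1)}(y) - \hat f^{(l)}(y)| > \hat M S_n(l) \right\}. $$

Note that $S_n(l)$ is monotonically increasing in $l$ and, therefore,

$$ \{\hat k = k\} \subseteq \left\{ |\hat f^{(k-1)}(y) - f(y)| > 2^{-1} \hat M S_n(k - 1) \right\} \cup \left[ \bigcup_{l \ge k} \left\{ |\hat f^{(l)}(y) - f(y)| > 2^{-1} \hat M S_n(l) \right\} \right]. $$

Taking into account that the event $G$ implies the realization of the event $\hat M \ge
M(f)/2 \ge A/2$, we come to the following inequality: for any $k \ge \kappa + 1$

$$ P_f(\hat k = k,\ G) \le P_f\left\{ |\hat f^{(k-1)}(y) - f(y)| > 4^{-1} M(f) S_n(k - 1),\ G \right\} + \sum_{l \ge k} P_f\left\{ |\hat f^{(l)}(y) - f(y)| > 4^{-1} M(f) S_n(l),\ G \right\}. \tag{7.15} $$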

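Now we justify the use of Proposition 1. Note that $b_{h_l} \le L d h_l^{\beta}$ since
$f \in \mathbb{H}_d(\beta, L, M, A)$ and, therefore, by definition of $h^*$, we have

$$ N_{h_l} \le L d\, n (h_l)^{\beta + d} \le L d\, n (h_\kappa)^{\beta + d} \le L d\, n (h^*)^{\beta + d} \le c\, \rho_n(\beta), \quad l \ge k - 1. \tag{7.16} $$

Remark that the definition of $S_n(l)$ yields

$$ n h_l^d S_n(l) \ge 432 D_b^3 (32qd + 16)\, \lambda_n^{-1}(h_l) \left[ 1 + \ln\left( h_{\max}/h_l \right) \right]. $$

Using (7.14), (7.16) and the last inequality, we have

$$ \frac{M(f)}{4}\, n h_l^d S_n(l) \ge 144 M D_b (1 \vee N_{h_l}) / (\lambda_n(h_l) A). \tag{7.17} $$

The last inequality allows us to apply Proposition 1 and Lemma 2 with
$\varepsilon = \frac{M(f)}{4} n h_l^d S_n(l)$; we obtain, $\forall l \ge k - 1$,

$$ P_f\left\{ |\hat f^{(l)}(y) - f(y)| > (M(f)/4)\, S_n(l),\ G \right\} \le \mathfrak{B}(A, M)\, \mathcal{E}(h_l) \left[ h_{\max}/h_l \right]^{-(8qd + 4)} = \mathfrak{B}(A, M)\, \mathcal{E}(h_l)\, 2^{-l(8qd + 4)}. \tag{7.18} $$

Here we have also used that $k \ge \kappa + 1$. We obtain from (7.15), (7.18) and
(7.16) that, $\forall k \ge \kappa + 1$,

$$ P_f(\hat k = k,\ G) \le J_2\, \mathfrak{B}(A, M) \exp\left\{ J_1 n (h^*)^{\beta + d} \right\} 2^{-(k-1)(8qd + 4)}, $$

where $J_2 = (1 - 2^{-(8qd + 4)})^{-1}$. ■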
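The constant $J_2$ is the sum of a geometric series: summing the bounds $2^{-l(8qd+4)}$ over $l \ge k - 1$ gives $J_2\, 2^{-(k-1)(8qd+4)}$. The short sketch below checks this identity numerically; it only illustrates the summation step, with arbitrary values of $q$ and $d$ and hypothetical function names.

```python
def tail_sum(k, c, terms=50):
    """Partial sum of 2^(-l c) over l = k-1, ..., k-2+terms."""
    return sum(2.0 ** (-l * c) for l in range(k - 1, k - 1 + terms))

def closed_form(k, c):
    """J_2 * 2^(-(k-1) c) with J_2 = 1 / (1 - 2^(-c))."""
    j2 = 1.0 / (1.0 - 2.0 ** (-c))
    return j2 * 2.0 ** (-(k - 1) * c)

q, d = 1, 1               # arbitrary illustration values
c = 8 * q * d + 4         # exponent 8qd + 4 appearing in Lemma 4
pairs = [(tail_sum(k, c), closed_form(k, c)) for k in range(2, 6)]
```

Each pair in `pairs` agrees to machine precision, since the neglected remainder of the series is far below floating-point resolution.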

7.5. Proof of Lemma 5

Put for any p £ Vb 

WLiy) - P,1-PJ ^ K^iO) M-lJy) K Iv_(,)(X.), 
and note that 6p = Y.ti V^M- 

The model (1.1) can be rewritten as 2Y^ = f{X^) + f{X^){2Ui - 1). Thus, 
puttmg F{X) - ifiX,))^^, „, V{X) = (/(X,)(2C/. - 1))^^^ „ and 



d\P\f(y) 



1'^. Deviations of M . By definition of M in (3.3), we obtain 

\M~M(f)\ < ||^-2?(/)||i < \\VF{X)-V{f)\\, + \\VV{X)\\,. 

Here V is Df, x n-matrix of general term Vpi = W^^{y) and is the ^i-norm. 
Let us prove that 



-/{|M-M(/)|>Af(/)/2} 



< 



exp ■ 



8^ 



(7.19) 



In view of the result proved in [7] and [22] there exist 1)1,^2 > such that 

\\VF{X)-Vif)\\, < ^i<;L^J, 

^2 



snp\WM\ < 



Remind that /imax "~^°°> 



and, therefore, 3no such that dih'LA'^^ < M{f)/A 
for any rt > np. Note that tiq can be chosen independent on / since M{f)/4 > 
A/4. Thus, we get 



-f{\M-M{f)\>M{f)/2} 



pGN'':0<\p\<P 



J2 f{X,){2U.-l)W::,iy) 



> 



M{f) 
4Dh 



Noting that \fiX,){2U,^l)W^Jy)\ < M{f)-^ "^^ 
lity [3] and the last inequality, we obtain 



- , applying Hoeffding inequa- 



peN<i: 0<|p|<^ 



J2 /(X,)(2C/.-lX,(y) 



> 



Mif) 
4Dh 



< Db exp 



MIDI 



Df, exp ■ 



mid! 



(7.20) 



Therefore (7.19) is proved. 

2°. Deviations of A. Since \ f{y) - A(/)| < Ldhl^^^^ < A(/)/4 for n > no one 
has 

{|i - A(/)| > A(/)/2} < {|Af - Mif)\ > Aif)/4} . 
Repeating the previous calculations we obtain



.,{|i-.,/)|>.,/)/2} < 0...,,[-^lg^ 



< Dbexpl > ■ (7.21 

Since $P_f(G^c) \le P_f(G_{\hat A}^c) + P_f(G_{\hat M}^c)$, the assertion of the lemma follows from
(7.20) and (7.21). ■